• HOLLYWOOD VFX TOOLS FOR SPACE EXPLORATION

    By CHRIS McGOWAN

    This image of Jupiter from NASA’s James Webb Space Telescope’s NIRCam (Near-Infrared Camera) shows stunning details of the majestic planet in infrared light.
    Special effects have been used for decades to depict space exploration, from visits to planets and moons to zero gravity and spaceships – one need only think of the landmark 2001: A Space Odyssey (1968). Since that era, visual effects have increasingly grown in realism and importance. VFX have been used for entertainment and for scientific purposes, outreach to the public and astronaut training in virtual reality. Compelling images and videos can bring data to life. NASA’s Scientific Visualization Studio (SVS) produces visualizations, animations and images to help scientists tell stories of their research and make science more approachable and engaging.
    A.J. Christensen is a senior visualization designer for the NASA Scientific Visualization Studio at the Goddard Space Flight Center in Greenbelt, Maryland. There, he develops data visualization techniques and designs data-driven imagery for scientific analysis and public outreach using Hollywood visual effects tools, according to NASA. SVS visualizations feature datasets from Earth- and space-based instrumentation, scientific supercomputer models and physical statistical distributions that have been analyzed and processed by computational scientists. Christensen’s specialties include working with 3D volumetric data, using the procedural cinematic software Houdini and science topics in Heliophysics, Geophysics and Astrophysics. He previously worked at the National Center for Supercomputing Applications’ Advanced Visualization Lab, where he worked on more than a dozen science documentary full-dome films as well as the IMAX films Hubble 3D and A Beautiful Planet – and he worked at DNEG on the movie Interstellar, which won the 2015 Best Visual Effects Academy Award.

    This global map of CO2 was created by NASA’s Scientific Visualization Studio using a model called GEOS, short for the Goddard Earth Observing System. GEOS is a high-resolution weather reanalysis model, powered by supercomputers, that is used to represent what was happening in the atmosphere.
    “The NASA Scientific Visualization Studio operates like a small VFX studio that creates animations of scientific data that has been collected or analyzed at NASA. We are one of several groups at NASA that create imagery for public consumption, but we are also a part of the scientific research process, helping scientists understand and share their data through pictures and video.”
    —A.J. Christensen, Senior Visualization Designer, NASA Scientific Visualization Studio

    About his work at NASA SVS, Christensen comments, “The NASA Scientific Visualization Studio operates like a small VFX studio that creates animations of scientific data that has been collected or analyzed at NASA. We are one of several groups at NASA that create imagery for public consumption, but we are also a part of the scientific research process, helping scientists understand and share their data through pictures and video. This past year we were part of NASA’s total eclipse outreach efforts, we participated in all the major earth science and astronomy conferences, we launched a public exhibition at the Smithsonian Museum of Natural History called the Earth Information Center, and we posted hundreds of new visualizations to our publicly accessible website: svs.gsfc.nasa.gov.”

    This is the ‘beauty shot version’ of Perpetual Ocean 2: Western Boundary Currents. The visualization starts with a rotating globe showing ocean currents. The colors used to color the flow in this version were chosen to provide a pleasing look.
    The Gulf Stream and connected currents.
    Venus, our nearby “sister” planet, beckons today as a compelling target for exploration that may connect the objects in our own solar system to those discovered around nearby stars.

    WORKING WITH DATA
    While Christensen is interpreting the data from active spacecraft and making it usable in different forms, such as for science and outreach, he notes, “It’s not just spacecraft that collect data. NASA maintains or monitors instruments on Earth too – on land, in the oceans and in the air. And to be precise, there are robots wandering around Mars that are collecting data, too.”
    He continues, “Sometimes the data comes to our team as raw telescope imagery, sometimes we get it as a data product that a scientist has already analyzed and extracted meaning from, and sometimes various sensor data is used to drive computational models and we work with the models’ resulting output.”

    Jupiter’s moon Europa may have life in a vast ocean beneath its icy surface.

    HOUDINI AND OTHER TOOLS
    “Data visualization means a lot of different things to different people, but many people on our team interpret it as a form of filmmaking,” Christensen says. “We are very inspired by the approach to visual storytelling that Hollywood uses, and we use tools that are standard for Hollywood VFX. Many professionals in our area – the visualization of 3D scientific data – were previously using other animation tools but have discovered that Houdini is the most capable of understanding and manipulating unusual data, so there has been major movement toward Houdini over the past decade.”

    Satellite imagery from NASA’s Solar Dynamics Observatory (SDO) shows the Sun in ultraviolet light colorized in light brown. Seen in ultraviolet light, the dark patches on the Sun are known as coronal holes and are regions where fast solar wind gushes out into space.
    Christensen explains, “We have always worked with scientific software as well – sometimes there’s only one software tool in existence to interpret a particular kind of scientific data. More often than not, scientific software does not have a GUI, so we’ve had to become proficient at learning new coding environments very quickly. IDL and Python are the generic data manipulation environments we use when something is too complicated or oversized for Houdini, but there are lots of alternatives out there. Typically, we use these tools to get the data into a format that Houdini can interpret, and then we use Houdini to do our shading, lighting and camera design, and seamlessly blend different datasets together.”
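    To make that pipeline concrete, here is a minimal Python sketch of the kind of conversion step Christensen describes – turning a volumetric science dataset into a format Houdini can read. It assumes a FITS data cube and the open-source pyopenvdb bindings; the file names and the "density" grid name are illustrative, not part of the SVS pipeline.

        # Illustrative sketch only: convert a volumetric FITS cube into an OpenVDB
        # volume that Houdini can load. Assumes astropy, numpy and pyopenvdb are
        # installed; 'cube.fits' and the 'density' grid name are hypothetical.
        import numpy as np
        from astropy.io import fits
        import pyopenvdb as vdb

        data = fits.getdata("cube.fits").astype(np.float32)  # 3D array from instrument or model
        data = np.nan_to_num(data)                           # replace missing samples with 0
        data /= max(float(data.max()), 1e-12)                # normalize to 0..1 for shading

        grid = vdb.FloatGrid()
        grid.copyFromArray(data)        # voxel values into a sparse VDB grid
        grid.name = "density"           # Houdini reads this as a named volume
        vdb.write("cube_density.vdb", grids=[grid])

    In Houdini, a File SOP pointed at the resulting .vdb would then pick up the "density" volume for the shading, lighting and camera work he describes.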

    While cruising around Saturn in early October 2004, Cassini captured a series of images that have been composed into this large global natural color view of Saturn and its rings. This grand mosaic consists of 126 images acquired in a tile-like fashion, covering one end of Saturn’s rings to the other and the entire planet in between.
    The black hole Gargantua and the surrounding accretion disc from the 2014 movie Interstellar.
    Another visualization of the black hole Gargantua.

    INTERSTELLAR & GARGANTUA
    Christensen recalls working for DNEG on Interstellar. “When I first started at DNEG, they asked me to work on the giant waves on Miller’s ocean planet. About a week in, my manager took me into the hall and said, ‘I was looking at your reel and saw all this astronomy stuff. We’re working on another sequence with an accretion disk around a black hole that I’m wondering if we should put you on.’ And I said, ‘Oh yeah, I’ve done lots of accretion disks.’ So, for the rest of my time on the show, I was working on the black hole team.”
    He adds, “There are a lot of people in my community that would be hesitant to label any big-budget movie sequence as a scientific visualization. The typical assumption is that for a Hollywood movie, no one cares about accuracy as long as it looks good. Guardians of the Galaxy makes it seem like space is positively littered with nebulae, and Star Wars makes it seem like asteroids travel in herds. But the black hole Gargantua in Interstellar is a good case for being called a visualization. The imagery you see in the movie is the direct result of a collaboration with an expert scientist, Dr. Kip Thorne, working with the DNEG research team using the actual Einstein equations that describe the gravity around a black hole.”

    Thorne is a Nobel Prize-winning theoretical physicist who taught at Caltech for many years. He has reached wide audiences with his books and presentations on black holes, time travel and wormholes on PBS and BBC shows. Christensen comments, “You can make the argument that some of the complexity around what a black hole actually looks like was discarded for the film, and they admit as much in the research paper that was published after the movie came out. But our team at NASA does that same thing. There is no such thing as an objectively ‘true’ scientific image – you always have to make aesthetic decisions around whether the image tells the science story, and often it makes more sense to omit information to clarify what’s important. Ultimately, Gargantua taught a whole lot of people something new about science, and that’s what a good scientific visualization aims to do.”

    The SVS produces an annual visualization of the Moon’s phase and libration comprising 8,760 hourly renderings of its precise size, orientation and illumination.

    FURTHER CHALLENGES
    The sheer size of the data often encountered by Christensen and his peers is a challenge. “I’m currently working with a dataset that is 400GB per timestep. It’s so big that I don’t even want to move it from one file server to another. So, then I have to make decisions about which data attributes to keep and which to discard, whether there’s a region of the data that I can cull or downsample, and I have to experiment with data compression schemes that might require me to entirely re-design the pipeline I’m using for Houdini. Of course, if I get rid of too much information, it becomes very resource-intensive to recompute everything, but if I don’t get rid of enough, then my design process becomes agonizingly slow.”
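    As a rough illustration of the triage Christensen describes, here is a minimal numpy sketch that culls, downsamples and compresses a single oversized timestep before it ever reaches Houdini. The file names, crop region, stride and reduced precision are assumed values for illustration, not the SVS pipeline.

        # Illustrative sketch: shrink one oversized timestep for design iterations.
        # File names, crop bounds, stride and precision are hypothetical choices.
        import numpy as np

        step = np.load("timestep_0400.npy", mmap_mode="r")  # memory-map rather than copy the full file
        region = step[120:480, 200:600, :]                  # cull to the region of interest
        coarse = region[::4, ::4, ::4]                      # keep every 4th voxel along each axis
        coarse = np.ascontiguousarray(coarse, dtype=np.float16)  # halve precision for previews

        np.savez_compressed("timestep_0400_preview.npz", density=coarse)
        print(f"preview is {coarse.nbytes / 1e9:.2f} GB in memory")

    The trade-off he mentions is visible here: the aggressive stride and float16 precision keep the design loop fast, but anything discarded has to be recomputed from the original data for final renders.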
    SVS also works closely with its NASA partner groups Conceptual Image Lab (CIL) and Goddard Media Studios (GMS) to publish a diverse array of content. Conceptual Image Lab focuses more on the artistic side of things – producing high-fidelity renders using film animation and visual design techniques, according to NASA. Where the SVS primarily focuses on making data-based visualizations, CIL puts more emphasis on conceptual visualizations – producing animations featuring NASA spacecraft, planetary observations and simulations, according to NASA. Goddard Media Studios, on the other hand, is more focused on public outreach – producing interviews, TV programs and documentaries. GMS continues to be the main producer behind NASA TV, and as such, much of its content is aimed at the general public.

    An impact crater on the moon.
    Image of Mars showing a partly shadowed Olympus Mons toward the upper left of the image.
    Mars. Hellas Basin can be seen in the lower right portion of the image.
    Mars slightly tilted to show the Martian North Pole.
    Christensen notes, “One of the more unique challenges in this field is one of bringing people from very different backgrounds to agree on a common outcome. I work on teams with scientists, communicators and technologists, and we all have different communities we’re trying to satisfy. For instance, communicators are generally trying to simplify animations so their learning goal is clear, but scientists will insist that we add text and annotations on top of the video to eliminate ambiguity and avoid misinterpretations. Often, the technologist will have to say we can’t zoom in or look at the data in a certain way because it will show the data boundaries or data resolution limits. Every shot is a negotiation, but in trying to compromise, we often push the boundaries of what has been done before, which is exciting.”
  • Multimodal Foundation Models Fall Short on Physical Reasoning: PHYX Benchmark Highlights Key Limitations in Visual and Symbolic Integration

    State-of-the-art models show human-competitive accuracy on AIME, GPQA, MATH-500, and OlympiadBench, solving Olympiad-level problems, and recent multimodal foundation models have advanced benchmarks for disciplinary knowledge and mathematical reasoning. However, these evaluations miss a crucial aspect of machine intelligence: physical reasoning, which requires integrating disciplinary knowledge, symbolic operations, and real-world constraints. Physical problem-solving differs fundamentally from pure mathematical reasoning: it demands that models decode implicit conditions in questions – for example, interpreting “smooth surface” as a zero friction coefficient – and maintain physical consistency across reasoning chains, because physical laws hold regardless of the reasoning trajectory.
    MLLMs show excellent visual understanding by integrating visual and textual data across various tasks, motivating exploration of their reasoning abilities. However, it remains unclear whether these models possess genuinely advanced reasoning capabilities for visual tasks, particularly in physical domains closer to real-world scenarios. Several LLM benchmarks have emerged to evaluate reasoning abilities, with PHYBench being the most relevant for physics reasoning. MLLM scientific benchmarks such as PhysReason and EMMA contain multimodal physics problems with figures; however, they include only small physics subsets, which inadequately evaluate MLLMs’ ability to reason about and solve advanced physics problems.
    Researchers from the University of Hong Kong, the University of Michigan, the University of Toronto, the University of Waterloo, and the Ohio State University have proposed PHYX, a novel benchmark to evaluate the physical reasoning capabilities of foundation models. It comprises 3,000 visually-grounded physics questions, precisely curated across six distinct physics domains: Mechanics, Electromagnetism, Thermodynamics, Wave/Acoustics, Optics, and Modern Physics. It evaluates physics-based reasoning via multimodal problem-solving with three core innovations: (a) 3,000 newly collected questions with realistic physical scenarios requiring integrated visual analysis and causal reasoning, (b) expert-validated data design covering six fundamental physics domains, and (c) strict, unified three-step evaluation protocols.

    Researchers designed a four-stage data collection process to ensure high-quality data. The process begins with an in-depth survey of core physics disciplines to determine coverage across diverse domains and subfields, followed by the recruitment of STEM graduate students as expert annotators. They comply with copyright restrictions and avoid data contamination by selecting questions whose answers are not immediately available. Quality control then involves a three-stage cleaning process: duplicate detection through lexical overlap analysis with manual review by physics Ph.D. students, followed by filtering out the shortest 10% of questions by text length, resulting in 3,000 high-quality questions from an initial collection of 3,300.
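    As a rough sketch of the cleaning steps just described – lexical-overlap duplicate detection followed by dropping the shortest 10% of questions – the following Python snippet uses a Jaccard word-overlap score. The 0.8 threshold and the helper names are assumptions for illustration; the paper’s actual parameters are not given here.

        # Illustrative sketch: flag near-duplicate questions by lexical (Jaccard)
        # overlap for manual review, then drop the shortest 10% by text length.
        # The 0.8 threshold is an assumed value, not the benchmark's.
        import numpy as np

        def jaccard(a: str, b: str) -> float:
            wa, wb = set(a.lower().split()), set(b.lower().split())
            return len(wa & wb) / max(len(wa | wb), 1)

        def clean(questions: list[str], overlap_threshold: float = 0.8) -> list[str]:
            flagged = set()
            for i in range(len(questions)):
                for j in range(i + 1, len(questions)):
                    if jaccard(questions[i], questions[j]) >= overlap_threshold:
                        flagged.add(j)              # candidate duplicate -> manual review
            kept = [q for k, q in enumerate(questions) if k not in flagged]
            cutoff = np.percentile([len(q) for q in kept], 10)
            return [q for q in kept if len(q) > cutoff]   # drop the shortest 10%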

    PHYX presents significant challenges for current models: even the worst-performing human experts achieve 75.6% accuracy, outperforming all evaluated models and showing a gap between human expertise and current model capabilities. The benchmark reveals that multiple-choice formats narrow performance gaps by allowing weaker models to rely on surface-level cues, whereas open-ended questions demand genuine reasoning and precise answer generation. Comparing GPT-4o’s performance on PHYX with its previously reported results on MathVista and MATH-V (both 63.8%), the lower accuracy on physical reasoning tasks emphasizes that physical reasoning requires deeper integration of abstract concepts and real-world knowledge, presenting greater challenges than purely mathematical contexts.
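    The gap between the two formats is easy to see in how answers get scored. The snippet below is a generic sketch, not PHYX’s actual three-step protocol: a multiple-choice item can be credited from a single recovered letter, while an open-ended item requires the model to produce the quantity itself, here checked against a numeric tolerance.

        # Generic sketch (not PHYX's published protocol): scoring a multiple-choice
        # response versus an open-ended numeric response.
        import re

        def score_multiple_choice(model_output: str, correct_letter: str) -> bool:
            match = re.search(r"\b([A-D])\b", model_output.upper())
            return bool(match) and match.group(1) == correct_letter

        def score_open_ended(model_output: str, correct_value: float, rel_tol: float = 0.01) -> bool:
            numbers = re.findall(r"-?\d+\.?\d*(?:[eE]-?\d+)?", model_output)
            if not numbers:
                return False
            answer = float(numbers[-1])             # treat the final number as the answer
            return abs(answer - correct_value) <= rel_tol * abs(correct_value)

        print(score_multiple_choice("The answer is (B).", "B"))  # True: the letter alone earns credit
        print(score_open_ended("So v = 19.6 m/s", 19.62))        # True only because the value was computed

    A weaker model can guess its way past the first check far more easily than the second, which is the surface-cue effect the benchmark reports.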
    In conclusion, researchers introduced PHYX, the first large-scale benchmark for evaluating physical reasoning in multimodal, visually grounded scenarios. Rigorous evaluation reveals that state-of-the-art models show limitations in physical reasoning, relying predominantly on memorized knowledge, mathematical formulas, and superficial visual patterns rather than genuine understanding of physical principles. The benchmark focuses exclusively on English-language prompts and annotations, limiting assessment of multilingual reasoning abilities. Also, while images depict physically realistic scenarios, they are often schematic or textbook-style rather than real-world photographs, which may not fully capture the complexity of perception in natural environments.

    Check out the Paper, Code and Project Page. All credit for this research goes to the researchers of this project.
  • ‘A Minecraft Movie’: Wētā FX Helps Adapt an Iconic Game One Block at a Time

    Adapting the iconic, block-based design aesthetic of Mojang’s beloved Minecraft video game into the hit feature comedy adventure A Minecraft Movie posed an enormous number of hurdles for director Jared Hess and Oscar-winning Production VFX Supervisor Dan Lemmon. Tasked with helping translate the pixelated world into something cinematically engaging, while remaining true to its visual DNA, was Wētā FX, which delivered 450 VFX shots on the film. Two of the studio’s key leads were VFX Supervisor Sheldon Stopsack and Animation Supervisor Kevin Estey.
    But the shot count merely scratches the surface of the extensive work the studio performed. Wētā led the design and creation of The Overworld – 64 unique terrains spanning deserts, lush forests, oceans, and mountain ranges, all combined into one continuous environment – assets that were also shared with Digital Domain for their work on the third-act battle. Wētā also handled extensive work on the lava-filled hellscape of The Nether, using Unreal Engine for early representations in previs, scene scouting, and on set during principal photography, before refining the environment during post-production. They also dressed The Nether with lava, fire, and torches, along with atmospherics and particulates like smoke, ash, and embers.

    But wait… there’s more!
    The studio’s Art Department, working closely with Hess, co-created the look and feel of all digital characters in the film. For Malgosha’s henchmen, the Piglins, Wētā designed and created 12 different variants, all with individual characteristics and personalities. They also designed sheep, bees, pandas, zombies, skeletons, and lovable wolf Dennis. Many of these characters were provided to other vendors for their work on the film.
    Needless to say, the studio truly became a “Master Builder” on the show.

    The film is based on the hugely popular game Minecraft, first released by Sweden’s Mojang Studios in 2011 and purchased by Microsoft for $2.5 billion in 2014, which immerses players in a low-res, pixelated “sandbox” simulation where they can use blocks to build entire worlds.
    Here's the final trailer:

    In a far-ranging interview, Stopsack and Estey shared with AWN a peek into their creative process, from early design exploration to creation of an intricate practical cloak for Malgosha and the use of Unreal Engine for previs, postvis, and real-time onset visualization.
    Dan Sarto: The film is filled with distinct settings and characters sporting various “block” styled features. Can you share some of the work you did on the environments, character design, and character animation?
    Sheldon Stopsack: There's so much to talk about and, truth be told, if you were to touch on everything, we would probably need to spend the whole day together.
    Kevin Estey: Sheldon and I realized that when we talk about the film, either amongst ourselves or with someone else, we could just keep going, there are so many stories to tell.
    DS: Well, start with The Overworld and The Nether. How did the design process begin? What did you have to work with?
    SS: Visual effects is a tricky business, you know. It's always difficult. Always challenging. However, Minecraft stood out to us as not your usual, quote unquote, standard visual effects project, even though, as you know, there is no standard visual effects project because they're all somehow different. They all come with their own creative ideas, inspirations, and challenges. But Minecraft, right from the get-go, was different, simply by the fact that when you first consider the idea of making such a live-action movie, you instantly ask yourself, “How do we make this work? How do we combine these two inherently very, very different but unique worlds?” That was everyone’s number one question. How do we land this? Where do we land this? And I don't think that any of us really had an answer, including our clients, Dan Lemmon [Production VFX Supervisor] and Jared Hess [the film’s director]. Everyone was really open to this journey. That's compelling for us, to get out of our comfort zone. It makes you nervous because there are no real obvious answers.
    KE: Early on, we seemed to thrive off these kinds of scary creative challenges. There were lots of question marks. We had many moments when we were trying to figure out character designs. We had a template from the game, but it was an incredibly vague, low-resolution template. And there were so many ways that we could go. But that design discovery throughout the project was really satisfying. 

    DS: Game adaptations are never simple. There usually isn’t much in the way of story. But with Minecraft, from a visual standpoint, how did you translate low res, block-styled characters into something entertaining that could sustain a 100-minute feature film?
    SS: Everything was a question mark. Using the lava that you see in The Nether as one example, we had beautiful concept art for all our environments, The Overworld and The Nether, but those concepts only really took you so far. They didn’t represent the block shapes or give you a clear answer as to how realistic some of those materials, shapes, and structures would be. How organic would we go? All of this needed to be explored. For the lava, we had stylized concept pieces, with block-shaped viscosity as it flowed down. But we spent months with our effects team, and Dan and Jared, just riffing on ideas. We came full circle, with the lava ending up being more realistic, a naturally viscous liquid based on real physics. And the same goes for the waterfall that you see in the Overworld.
    The question is, how far do we take things into the true Minecraft representation of things? How much do we scale back a little bit and ground ourselves in reality, with effects we’re quite comfortable producing as a company? There's always a tradeoff to find that balance of how best to combine what’s been filmed, the practical sets and live-action performances, with effects. Where’s the sweet spot? What's the level of abstraction? What's honest to the game? As much as some call Minecraft a simple game, it isn't simple, right? It's incredibly complex. It's got a set of rules and logic to the world building process within the game that we had to learn, adapt, and honor in many ways.
    When our misfits first arrive and we have these big vistas and establishing shots, when you really look at it, you recognize a lot of the things that we tried to adapt from the game. There are different biomes, like the Badlands, which is very sandstone-y; there's the Woodlands, which is a lush environment with cherry blossom trees; and you’ve got the snow biome with big mountains in the background. Our intent was to honor the game.
    KE: I took a big cue from a lot of the early designs, and particularly the approach that Jared liked for the characters and the design in general, which was maintaining the stylized, blocky aesthetic but covering them in realistic flesh, fur, things that were going to make them appear as real as possible despite the absolutely unreal designs of their bodies. So essentially, it was a squared skeleton… squarish bones with flesh and realistic fur laid over top. We tried various things, all extremely stylized. The Creepers are a good example. We tried all kinds of ways for them to explode. Sheldon found a great reference for a cat coughing up a hairball. He was nice enough to censor the worst part of it, but those undulations in the chest and ribcage… Jared spoke of the Creepers being basically tragic characters that only wanted to be loved, to just be close to you. But sadly, whenever they got close, they’d explode. So, we experimented with a lot of different motions of how they’d explode.

    DS: Talk about the process of determining how these characters would move. None seem to have remotely realistic proportions in their limbs, bodies, or head size.
    KE: There were a couple things that Jared always seemed to be chasing. One was just something that would make him laugh. Of course, it had to sit within the bounds of how a zombie might move, or a skeleton might move, as we were interpreting the game. But the main thing was just, was it fun and funny? I still remember one of the earliest gags they came up with in mocap sessions, even before I joined the show, was how the zombies get up after they fall over. It was sort of like a tripod, where its face and feet were planted and its butt shoots up in the air.
    After a lot of experimentation, we came up with basic personality types for each character. There were 12 different types of Piglins. The zombies were essentially like you're coming home from the pub after a few too many pints and you're just trying to get in the door, but you can't find your keys. Loose, slightly inebriated movement. The best movement we found for the skeletons was essentially like an old man with rigid limbs and lack of ligaments that was chasing kids off his lawn. And so, we created this kind of bible of performance types that really helped guide performers on the mocap stage and animators later on.
    SS: A lot of our exploration didn’t stick. But Jared was the expert in all of this. He always came up with some quirky last-minute idea. 
    KE: My favorite from Jared came in the middle of one mocap shoot. He walked up to me and said he had this stupid idea. I said OK, go on. He said, what if Malgosha had these two little pigs next to her, like Catholic altar boys [the thurifers], swinging incense [a thurible]. Can we do that? I talked to our stage manager, and we quickly put together a temporary prop for the incense burners. And we got two performers who just stood there. What are they going to do? Jared said, “Nothing. Just stand there and swing. I think it would look funny.” So, that’s what we did. We dubbed them the Priesty Boys. And they are there throughout the film. That was amazing about Jared. He was always like, let's just try it, see if it works. Otherwise, ditch it.

    DS: Tell me about your work on Malgosha. And I also want to discuss your use of Unreal Engine and the previs and postvis work. 
    SS: For Malgosha as a character, our art department did a phenomenal job finding the character design at the concept phase. But it was a collective effort. So many contributors were involved in her making. And I'm not just talking about the digital artists here on our side. It was a joint venture of different people having different explorations and experiments. It started off with the concept work as a foundation, which we mocked up with 3D sketches before building a model. But with Malgosha, we also had the costume department on the production side building this elaborate cloak. Remember, that cloak makes up about 80, 85% of her appearance. It's almost like a character in itself, the way we utilized it. And the costume department built this beautiful, elaborate, incredibly intricate, practical version of it that we intended to use on set for the performer to wear. It ended up being too impractical because it was too heavy. But it was beautiful. So, while we didn't really use it on set, it gave us something physical to incorporate into our digital version.
    KE: Alan Henry is the motion performer who portrayed her on set and on the mocap stage. I've known him for close to 15 years. I started working with him on The Hobbit films. He was a stunt performer who eventually rolled into doing motion capture with us on The Hobbit. He’s an incredible actor and absolutely hilarious and can adapt to any sort of situation. He’s so improvisational. He came up with an approach to Malgosha very quickly. Added a limp so that she felt decrepit, leaning on the staff, adding her other arm as kind of like a gimp arm that she would point and gesture with.  
    Even though she’s a blocky character, her anatomy is very much a biped, with rounder limbs than the other Piglins. She's got hooves, is somewhat squarish, and her much more bulky mass in the middle was easier to manipulate and move around. Because she would have to battle with Steve in the end, she had to have a level of agility that even some of the Piglins didn't have.

    DS: Did Unreal Engine come into play with her? 
    SS: Unreal was used all the way through the project. Dan Lemmon and his team early on set up their own virtual art department to build representations of the Overworld and the Nether within the context of Unreal. We and Sony Imageworks tried to provide recreations of these environments that were then used within Unreal to previsualize what was happening on set during shooting of principal photography. And that's where our mocap and on-set teams were coming into play. Effects provided what we called the Nudge Cam. It was a system to do real-time tracking using a stereo pair of Basler computer vision cameras that were mounted onto the sides of the principal camera. We provided the live tracking that was then composited in real time with the Unreal Engine content that all the vendors had provided. It was a great way of utilizing Unreal to give the camera operators or DOP, even Jared, a good sense of what we would actually shoot. It gave everyone a little bit of context for the look and feel of what you could actually expect from these scenes. 
    Because we started this journey with Unreal with on-set use in mind, we internally decided, look, let's take this further. Let's take this into post-production as well. What would it take to utilize Unreal for shot creation? It was used exclusively on the Nether environment. I don’t want to say we used it for matte painting replacement. We used it more for, say, let's build this extended environment in Unreal. Not only use it as a render engine with a reasonably fast turnaround, but also use it for what it's good at: authoring things, quickly changing things, moving columns around, manipulating things, dressing them, lighting them, and rendering them. It became sort of a tool that we used in place of a traditional matte painting for the extended environments.
    KE: Another thing worth mentioning is we were able to utilize it on our mocap stage as well during the two-week shoot with Jared and crew. When we shoot on the mocap stage, we get a very simple sort of gray shaded diagnostic grid. You have your single-color characters that sometimes are textured, but they’re fairly simple without any context of environment. Our special projects team was able to port what we usually see in Giant, the software we use on the mocap stage, into Unreal, which gave us these beautifully lit environments with interactive fire and atmosphere. And Jared and the team could see their movie for the first time in a rough, but still very beautiful rough state. That was invaluable.

    DS: If you had to key on anything, what would you say were the biggest challenges for your teams on the film? You're laughing. I can hear you thinking, “Do we have an hour?”
    KE: Where do you begin? 
    SS: Exactly. It's so hard to really single one out. And I struggle with it every time I'm asked that question.
    KE: I’ll start. I've got a very simple practical answer and then a larger one, something that was new to us, kind of similar to what we were just talking about. The simple practical one is the Piglins' square feet with no ankles. It was very tough to make them walk realistically. Think of the leg of a chair. How do you make that roll and bank and bend when there is no joint? There are a lot of Piglins walking on surfaces, and it was a very difficult conundrum to solve. It took a lot of hard work from our motion edit team and our animation team to get those things walking realistically. You know, it’s doing that simple thing that you don't usually pay attention to. So that was one reasonably big challenge that is often literally buried in the shadows. The bigger one was something that was new to me. We often do a lot of our previs and postvis in-house and then finish the shots. And just because of circumstances and capacity, we did the postvis for the entire final battle, but we ended up sharing the sequence with Digital Domain, who did an amazing job completing some of the battlefield work we had done postvis on. For me personally, I've never experienced not finishing what I started. But it was also really rewarding to see how well the work we had put in was honored by DD when they took it over.
    SS: I think the biggest challenge and the biggest achievement that I'm most proud of is really ending up with something that was well received by the wider audience. Of creating these two worlds, this sort of abstract adaptation of the Minecraft game and combining it with live-action. That was the achievement for me. That was the biggest challenge. We were all nervous from day one. And we continued to be nervous up until the day the movie came out. None of us really knew how it ultimately would be received. The fact that it came together and was so well received is a testament to everyone doing a fantastic job. And that's what I'm incredibly proud of.

    Dan Sarto is Publisher and Editor-in-Chief of Animation World Network.
  • This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with 40K Dataset and 10× Cost Efficiency

    Web navigation focuses on teaching machines how to interact with websites to perform tasks such as searching for information, shopping, or booking services. Building a capable web navigation agent is a complex task because it requires understanding the structure of websites, interpreting user goals, and making a series of decisions across multiple steps. These tasks are further complicated by the need for agents to adapt in dynamic web environments, where content can change frequently and where multimodal information, such as text and images, must be understood together.
    A key problem in web navigation is the absence of reliable and detailed reward models that can guide agents in real time. Existing methods primarily rely on multimodal large language models (MLLMs) like GPT-4o and GPT-4o-mini as evaluators, which are expensive, slow, and often inaccurate, especially when handling long sequences of actions in multi-step tasks. These models use prompting-based evaluation or binary success/failure feedback but fail to provide step-level guidance, often leading to errors such as repeated actions or missing critical steps like clicking specific buttons or filling form fields. This limitation reduces the practicality of deploying web agents in real-world scenarios, where efficiency, accuracy, and cost-effectiveness are crucial.

    The research team from Yonsei University and Carnegie Mellon University introduced WEB-SHEPHERD, a process reward model specifically designed for web navigation tasks. WEB-SHEPHERD is the first model to evaluate web navigation agents at the step level, using structured checklists to guide assessments. The researchers also developed the WEBPRM COLLECTION, a dataset of 40,000 step-level annotated web navigation tasks, and the WEBREWARDBENCH benchmark for evaluating PRMs. These resources were designed to enable WEB-SHEPHERD to provide detailed feedback by breaking down complex tasks into smaller, measurable subgoals.

    WEB-SHEPHERD works by generating a checklist for each task based on the user’s instruction, such as “Search for product” or “Click on product page,” and evaluating the agent’s progress against these subgoals. The model uses next-token prediction to generate feedback and assigns rewards based on checklist completion. This process enables WEB-SHEPHERD to assess the correctness of each step with fine-grained judgment. The model estimates the reward for each step by combining the probabilities of “Yes,” “No,” and “In Progress” tokens and averaging these across the checklist. This detailed scoring system enables agents to receive targeted feedback on their progress, enhancing their ability to navigate complex websites.
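    To make that scoring rule concrete, here is a minimal Python sketch. It assumes, purely for illustration, that each checklist item's score is P("Yes") plus partial credit for P("In Progress") (weighted 0.5 here), normalized over the three judgment tokens and then averaged across the checklist; the weighting and every name below are hypothetical, not taken from the paper.

    # Illustrative sketch only: the token-probability weighting and all names here
    # are assumptions for explanation, not the paper's implementation.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class ChecklistJudgment:
        """Probabilities the reward model assigns to one checklist item."""
        p_yes: float
        p_no: float
        p_in_progress: float

    def step_reward(judgments: List[ChecklistJudgment], in_progress_credit: float = 0.5) -> float:
        """Combine per-item token probabilities and average across the checklist."""
        if not judgments:
            return 0.0
        scores = []
        for j in judgments:
            total = j.p_yes + j.p_no + j.p_in_progress
            # Full credit for "Yes", partial credit for "In Progress", none for "No".
            scores.append((j.p_yes + in_progress_credit * j.p_in_progress) / total)
        return sum(scores) / len(scores)

    # Example: three subgoals -- one done, one in progress, one not started.
    checklist = [
        ChecklistJudgment(p_yes=0.90, p_no=0.05, p_in_progress=0.05),
        ChecklistJudgment(p_yes=0.20, p_no=0.10, p_in_progress=0.70),
        ChecklistJudgment(p_yes=0.05, p_no=0.90, p_in_progress=0.05),
    ]
    print(f"step reward: {step_reward(checklist):.3f}")  # roughly 0.52 under these assumptions

    Scoring of this shape yields dense, step-level feedback rather than a single end-of-trajectory success flag, which is what allows the process reward model to guide an agent mid-task.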
    The researchers demonstrated that WEB-SHEPHERD significantly outperforms existing models. On the WEBREWARDBENCH benchmark, WEB-SHEPHERD achieved a Mean Reciprocal Rank (MRR) score of 87.6% and a trajectory accuracy of 55% in the text-only setting, compared to GPT-4o-mini’s 47.5% MRR and 0% trajectory accuracy without checklists. When tested in WebArena-lite using GPT-4o-mini as the policy model, WEB-SHEPHERD achieved a 34.55% success rate, which is 10.9 points higher than using GPT-4o-mini as the evaluator, while also being ten times more cost-efficient. In ablation studies, the researchers observed that WEB-SHEPHERD’s performance dropped significantly when checklists or feedback were removed, confirming their importance for accurate reward assignments. They also showed that multimodal input, surprisingly, did not always improve performance and sometimes introduced noise.
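    For reference, Mean Reciprocal Rank is the standard ranking metric sketched below: each task contributes the reciprocal of the rank at which the preferred (gold) candidate appears, and the values are averaged. How WEBREWARDBENCH constructs its candidate rankings is not described here, so this only illustrates the metric itself.

    # Standard MRR over 1-based ranks of the gold candidate; illustrative only.
    def mean_reciprocal_rank(gold_ranks: List[int]) -> float:
        """Average of 1/rank at which the gold item appears, one rank per task."""
        return sum(1.0 / r for r in gold_ranks) / len(gold_ranks)

    # Example: the preferred action ranked 1st, 2nd, and 1st across three tasks.
    print(mean_reciprocal_rank([1, 2, 1]))  # ≈ 0.833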

    This research highlights the critical role of detailed process-level rewards in building reliable web agents. The team’s work addresses the core challenge of web navigation—evaluating complex, multi-step actions—and offers a solution that is both scalable and cost-effective. With WEB-SHEPHERD, agents can now receive accurate feedback during navigation, enabling them to make better decisions and complete tasks more effectively.

    Check out the Paper and GitHub Page. All credit for this research goes to the researchers of this project.
    #this #paper #introduces #webshepherd #process
    This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with 40K Dataset and 10× Cost Efficiency
    Web navigation focuses on teaching machines how to interact with websites to perform tasks such as searching for information, shopping, or booking services. Building a capable web navigation agent is a complex task because it requires understanding the structure of websites, interpreting user goals, and making a series of decisions across multiple steps. These tasks are further complicated by the need for agents to adapt in dynamic web environments, where content can change frequently and where multimodal information, such as text and images, must be understood together. A key problem in web navigation is the absence of reliable and detailed reward models that can guide agents in real-time. Existing methods primarily rely on multimodal large language modelslike GPT-4o and GPT-4o-mini as evaluators, which are expensive, slow, and often inaccurate, especially when handling long sequences of actions in multi-step tasks. These models use prompting-based evaluation or binary success/failure feedback but fail to provide step-level guidance, often leading to errors such as repeated actions or missing critical steps like clicking specific buttons or filling form fields. This limitation reduces the practicality of deploying web agents in real-world scenarios, where efficiency, accuracy, and cost-effectiveness are crucial. The research team from Yonsei University and Carnegie Mellon University introduced WEB-SHEPHERD, a process reward model specifically designed for web navigation tasks. WEB-SHEPHERD is the first model to evaluate web navigation agents at the step level, using structured checklists to guide assessments. The researchers also developed the WEBPRM COLLECTION, a dataset of 40,000 step-level annotated web navigation tasks, and the WEBREWARDBENCH benchmark for evaluating PRMs. These resources were designed to enable WEB-SHEPHERD to provide detailed feedback by breaking down complex tasks into smaller, measurable subgoals. WEB-SHEPHERD works by generating a checklist for each task based on the user’s instruction, such as “Search for product” or “Click on product page,” and evaluates the agent’s progress against these subgoals. The model uses next-token prediction to generate feedback and assigns rewards based on checklist completion. This process enables WEB-SHEPHERD to assess the correctness of each step with fine-grained judgment. The model estimates the reward for each step by combining the probabilities of “Yes,” “No,” and “In Progress” tokens and averages these across the checklist. This detailed scoring system enables agents to receive targeted feedback on their progress, enhancing their ability to navigate complex websites. The researchers demonstrated that WEB-SHEPHERD significantly outperforms existing models. On the WEBREWARDBENCH benchmark, WEB-SHEPHERD achieved a Mean Reciprocal Rankscore of 87.6% and a trajectory accuracy of 55% in the text-only setting, compared to GPT-4o-mini’s 47.5% MRR and 0% trajectory accuracy without checklists. When tested in WebArena-lite using GPT-4o-mini as the policy model, WEB-SHEPHERD achieved a 34.55% success rate, which is 10.9 points higher than using GPT-4o-mini as the evaluator, while also being ten times more cost-efficient. In ablation studies, the researchers observed that WEB-SHEPHERD’s performance dropped significantly when checklists or feedback were removed, proving their importance for accurate reward assignments. 
    The researchers demonstrated that WEB-SHEPHERD significantly outperforms existing approaches. On the WEBREWARDBENCH benchmark, WEB-SHEPHERD achieved a Mean Reciprocal Rank (MRR) score of 87.6% and a trajectory accuracy of 55% in the text-only setting, compared with GPT-4o-mini's 47.5% MRR and 0% trajectory accuracy without checklists. When tested in WebArena-lite with GPT-4o-mini as the policy model, WEB-SHEPHERD achieved a 34.55% success rate, 10.9 points higher than using GPT-4o-mini as the evaluator, while being roughly ten times more cost-efficient. In ablation studies, performance dropped significantly when checklists or feedback were removed, underscoring their importance for accurate reward assignment. The researchers also found that multimodal input, surprisingly, did not always improve performance and sometimes introduced noise.
    This research highlights the critical role of detailed process-level rewards in building reliable web agents. The work addresses the core challenge of web navigation, evaluating complex multi-step actions, and offers a solution that is both scalable and cost-effective. With WEB-SHEPHERD, agents can receive accurate feedback during navigation, enabling them to make better decisions and complete tasks more effectively.
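    For reference, Mean Reciprocal Rank is a standard ranking metric: for each test case the candidate actions are ordered by the reward model's scores, and the reciprocal of the rank at which the correct action appears is averaged over all cases. The sketch below illustrates the metric itself; the exact candidate construction and tie-breaking used in WEBREWARDBENCH are not reproduced here.

        def mean_reciprocal_rank(cases: list[tuple[list[float], int]]) -> float:
            """Each case is (scores for all candidate actions, index of the correct one)."""
            total = 0.0
            for scores, correct_idx in cases:
                # rank = 1 + number of candidates scored strictly higher than the correct one
                rank = 1 + sum(s > scores[correct_idx] for s in scores)
                total += 1.0 / rank
            return total / len(cases)

        # Example: the correct action ranks 1st in the first case and 2nd in the second.
        cases = [([0.9, 0.2, 0.4], 0), ([0.5, 0.8, 0.1], 0)]
        print(mean_reciprocal_rank(cases))  # (1/1 + 1/2) / 2 = 0.75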
  • From Smart to Intelligent: Evolution in Architecture and Cities

    Algae Curtain / EcoLogicStudio. Image © ecoLogicStudio
    "The limits of our design language are the limits of our design thinking." Patrik Schumacher's statement hints at a shift occurring in the built environment, one that moves beyond technological integration to embrace intelligence in the spaces and cities we occupy. The future raises the possibility of buildings that do more than house human activity, actively participating in shaping urban life.
    The architecture profession has long been enamored with "smart" buildings - structures that collect and process data through sensor networks and automated systems. Smart cities were heralded as a way to improve quality of life as well as the sustainability and efficiency of city operations through technology. While smart buildings and cities remain far from fully realized, these advancements mark only the beginning of a much more impactful application of technology in the built environment. Being smart is about collecting data. Being intelligent is about interpreting that data and acting autonomously upon it.
    The next generation of intelligent buildings will focus both on externalities and on the integration of advanced interior systems to improve energy efficiency, sustainability, and security. One exterior application is walls with rotatable units that respond automatically to real-time environmental data, optimizing ventilation and insulation without human intervention. Kinetic architectural elements, integrated with artificial intelligence, create responsive exteriors that breathe and adapt. Networked photovoltaic glass systems may share surplus energy across buildings, establishing efficient microgrids that transform individual structures into nodes within larger urban systems.
    Interior spaces are experiencing a similar evolution through platforms like Honeywell's Advance Control for Buildings, which integrates cybersecurity, accelerated network speeds, and autonomous decision-making capabilities. Such systems simultaneously optimize HVAC, lighting, and security subsystems through real-time adjustments that respond to environmental shifts and occupant behavior patterns. Advanced security incorporates deep learning-powered facial recognition, while sophisticated voice controls distinguish between human commands and background noise with high accuracy.
    Kas Oosterhuis envisions architecture where building components become senders and receivers of real-time information, creating communicative networks: "People communicate. Buildings communicate. People communicate with people. People communicate with buildings. Buildings communicate with buildings." This swarm architecture represents an open-source, real-time system in which all elements participate in continuous information exchange.
    While these projects are impressive, they also bring critical issues about autonomy and control to light. How much decision-making authority should we delegate to our buildings? Should structures make choices for us, or simply offer informed suggestions based on learned patterns?
    Beyond buildings, intelligent systems can remodel urban management through AI and machine learning applications. Solutions that monitor and predict pedestrian traffic patterns in public spaces are being explored. For instance, Carlo Ratti's collaboration with Google's Sidewalk Labs hints at the possibility of a streetscape that seamlessly adapts to people's needs, through a prototype of a modular and reconfigurable paving system in Toronto. The Dynamic Street features hexagonal modular pavers that can be picked up and replaced within hours or even minutes, swiftly changing the function of the road without disrupting the street. Sidewalk Labs also developed technologies like Delve, a machine-learning tool for designing cities, and pursued sustainability through initiatives like Mesa, a building-automation system.
    Cities are becoming their own sensors at an elemental level, with the physical fabric automated to monitor performance and use continuously. Digital skins overlay these material systems, enabling populations to navigate urban complexity in real time: locating services, finding acquaintances, and identifying transportation options.
    The implications extend beyond immediate utility. Remote sensing capabilities offer insights into urban growth patterns, long-term usage trends, and global-scale problems that individual real-time operations cannot detect.
This creates enormous opportunities for urban design that acknowledges the city as a self-organizing system, moving beyond traditional top-down planning toward bottom-up growth enabled by embedded information systems.
    While artificial intelligence dominates discussions of intelligent architecture, parallel developments are emerging through non-human biological intelligence. Researchers are discovering the profound capabilities of living organisms - bacteria, fungi, algae - that have evolved sophisticated strategies over millions of years. Micro-organisms possess intelligence that often eludes human comprehension, yet their exceptional properties offer transformative potential for urban design.
    EcoLogicStudio's work on the H.O.R.T.U.S. series exemplifies this biological turn in intelligent architecture. The acronym, Hydro Organism Responsive To Urban Stimuli, describes photosynthetic sculptures and urban structures that create artificial habitats for cyanobacteria integrated within the built environment. These living systems function not merely as decorative elements but as active metabolic participants, absorbing emissions from building systems while producing biomass and oxygen through photosynthesis. The PhotoSynthetica Tower project, unveiled at Tokyo's Mori Art Museum, materializes this vision as a complex synthetic organism in which bacteria, autonomous farming machines, and various forms of animal intelligence become bio-citizens alongside humans.
    The future of intelligent architecture lies not in replacing human decision-making but in creating sophisticated feedback loops between human and non-human intelligence. This synthesis recognizes that our knowledge remains incomplete in any age, particularly as new developments push us away from lifestyles bound to a single place and toward embracing multiple locations and experiences.
    The built environment's role in emerging technologies extends far beyond operational efficiency or cost savings. Intelligent buildings can serve as active participants in sustainability targets, wellness strategies, and broader urban resilience planning. The possibility of intelligent architecture challenges the industry to expand its design language. The question facing the profession is not whether intelligence will permeate the built environment, but how well positioned architects are to design for that intelligence, manage its implications, and partner with buildings as collaborators in shaping the human experience.
    This article is part of the ArchDaily Topics series "What Is Future Intelligence?", presented by Gendo, an AI co-pilot for architects.