New Apple AI model generates 3D scenes from just three images


2025-05-14 10:38:16

Apple’s Machine Learning team, in collaboration with researchers from Nanjing University and The Hong Kong University of Science and Technology, has announced an interesting 3D AI model called Matrix3D.
This so-called Large Photogrammetry Model can reconstruct 3D objects and scenes from just a few 2D photos, but it works quite differently from current pipelines.
Here’s why this is a big deal.
First things first: photogrammetry.
Photogrammetry is the technique of taking measurements from photographs in order to create 3D models or maps.
Currently, this process relies on separate models for steps like pose estimation and depth prediction, which can lead to inefficiencies and errors.
Matrix3D simplifies this by doing it all in one go.
It takes in images, camera parameters (such as angle and focal length), and depth data, and processes them using a unified architecture.
This not only simplifies the workflow but also improves accuracy.
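To make the relationship between those three inputs concrete, here is a minimal Python sketch of the standard pinhole camera model, which ties a pixel, its depth, and the focal length to a 3D point. It is not taken from the Matrix3D code: the `Camera` class, the function names, and the numbers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Camera:
    """Hypothetical pinhole camera: focal length and principal point, in pixels."""
    focal: float
    cx: float
    cy: float


def backproject(u, v, depth, cam):
    """Lift pixel (u, v) with a known depth to a 3D point in camera coordinates."""
    x = (u - cam.cx) * depth / cam.focal
    y = (v - cam.cy) * depth / cam.focal
    return (x, y, depth)


def project(point, cam):
    """Project a 3D camera-space point back to pixel coordinates."""
    x, y, z = point
    return (cam.focal * x / z + cam.cx, cam.focal * y / z + cam.cy)


# Illustrative values only: a 640x480 image with a 500-pixel focal length.
cam = Camera(focal=500.0, cx=320.0, cy=240.0)
point = backproject(420.0, 240.0, 2.0, cam)
print("3D point:", point)
print("reprojected pixel:", project(point, cam))
```

Because the pixel, the depth, and the camera parameters constrain one another in this way, knowing some of them helps recover the others, which is the intuition behind handling them in a single model rather than in separate steps.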
Even more interesting is how the model was trained.
The researchers used a masked learning strategy, much like the one used to train the early Transformer-based AI systems that helped pave the way for the first versions of ChatGPT.
They randomly hid parts of the input data during training, which forced Matrix3D to learn how to fill in the gaps.
This technique is key because it enables Matrix3D to train effectively even with smaller or incomplete datasets.
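The idea of hiding inputs can be sketched in a few lines of Python. This is an illustration of random modality masking in general, not the researchers' actual training code: the modality names, the `mask_sample` helper, and the 50% masking probability are assumptions made for the example.

```python
import random

# Modality names are assumptions for illustration.
MODALITIES = ("image", "pose", "depth")


def mask_sample(sample, mask_prob=0.5, rng=random):
    """Randomly split a sample's modalities into visible inputs and hidden targets.

    Modalities missing from the sample (value None) are skipped, so an
    incomplete sample can still be used.
    """
    available = [name for name in MODALITIES if sample.get(name) is not None]
    if len(available) < 2:
        raise ValueError("need at least two modalities: one to keep, one to hide")

    hidden = {name for name in available if rng.random() < mask_prob}
    if not hidden:
        # Always hide something, so there is a target to predict.
        hidden.add(rng.choice(available))
    if len(hidden) == len(available):
        # Always keep something visible, so there is an input to condition on.
        hidden.discard(rng.choice(available))

    visible = {name: sample[name] for name in available if name not in hidden}
    targets = {name: sample[name] for name in available if name in hidden}
    return visible, targets


# A sample with no depth map: only the image and the pose take part.
rng = random.Random(0)
sample = {"image": "img_0.png", "pose": [0.0, 0.0, 1.0], "depth": None}
visible, targets = mask_sample(sample, rng=rng)
print("visible:", sorted(visible), "to predict:", sorted(targets))
```

A sample with no depth map simply contributes its image and camera pose, which is one way incomplete datasets could still be useful under this kind of scheme.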
The results are impressive.
With just three input images, Matrix3D can generate detailed 3D reconstructions of objects and even entire environments, which could have interesting applications for immersive headsets like the Apple Vision Pro.
The researchers made the source code for Matrix3D available on GitHub, and published their paper on arXiv.
They also created a website where you can watch more sample videos and even interact with a few point cloud recreations of objects and environments.
Source: https://9to5mac.com/2025/05/13/apple-study-3d-objects-from-images/
