3D understanding plays an important role in advancing the ability of AI systems to better understand and operate in the real world – including navigating physical space in robotics, improving virtual reality experiences, and even recognizing occluded objects in 2D content.

But research in 3D deep learning has been held back by a lack of tools and resources that support the complexities of using neural networks with 3D data, and by the fact that many traditional graphics operators are not differentiable.

Now Facebook AI has built and released PyTorch3D, a highly modular and optimized library with unique capabilities designed to make 3D deep learning easier with PyTorch. PyTorch3D provides a set of frequently used 3D operators and loss functions for 3D data that are fast and differentiable, as well as a modular differentiable rendering API – enabling researchers to import these functions into current state-of-the-art deep learning systems right away.

PyTorch3D was recently a catalyst in Facebook AI’s work to build Mesh R-CNN, which achieved full 3D object reconstruction from images of complex interior spaces. We fused PyTorch3D with our highly optimized 2D recognition library, Detectron2, to successfully push object understanding to the third dimension. PyTorch3D functions for handling rotations and 3D transformations were also central in creating C3DPO, a novel method for learning associations between images and 3D shapes using less annotated training data.

Researchers and engineers can similarly leverage PyTorch3D for a wide variety of 3D deep learning research – whether 3D reconstruction, bundle adjustment, or even 3D reasoning – to improve 2D recognition tasks. We are sharing the PyTorch3D library and open-sourcing our Mesh R-CNN codebase.

Introduction

PyTorch3D provides efficient, reusable components for 3D computer vision research with PyTorch.

Key features include:

  • Data structure for storing and manipulating triangle meshes
  • Efficient operations on triangle meshes (projective transformations, graph convolution, sampling, loss functions)
  • A differentiable mesh renderer
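
To make the "loss functions" item above more concrete, here is a minimal sketch of a symmetric Chamfer distance between two point sets, written in plain PyTorch. It is an illustration only, not PyTorch3D's own implementation; the function name `chamfer_distance_sketch` and the toy point sets are assumptions made for this example.

```python
import torch


def chamfer_distance_sketch(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Symmetric Chamfer distance between point sets x (N, 3) and y (M, 3).

    Hypothetical helper for illustration; not part of PyTorch3D.
    """
    # Pairwise squared Euclidean distances, shape (N, M).
    diff = x[:, None, :] - y[None, :, :]
    d2 = (diff ** 2).sum(dim=-1)
    # For every point in x, squared distance to its nearest neighbor in y,
    # and vice versa; average each direction and add them.
    x_to_y = d2.min(dim=1).values.mean()
    y_to_x = d2.min(dim=0).values.mean()
    return x_to_y + y_to_x


# Toy data: predicted points (trainable) and target points (fixed).
pred = torch.tensor([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], requires_grad=True)
target = torch.tensor([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])

loss = chamfer_distance_sketch(pred, target)
loss.backward()  # gradients flow back to the predicted point positions

print(loss.item())
print(pred.grad)
```

Because the loss is built only from differentiable tensor operations, `pred.grad` indicates how to move each predicted point toward the target set, which is the property that lets such a loss sit inside an ordinary training loop.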

PyTorch3D is designed to integrate smoothly with deep learning methods for predicting and manipulating 3D data. For this reason, all operators in PyTorch3D:

  • Are implemented using PyTorch tensors
  • Can handle minibatches of heterogeneous data
  • Can be differentiated
  • Can utilize GPUs for acceleration
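
The four properties above can be sketched in plain PyTorch. The example below batches two meshes with different vertex counts by padding, computes a per-mesh centroid with a mask, and backpropagates through the result. It does not use PyTorch3D's own data structures; the variable names and toy vertex data are assumptions chosen for illustration.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Use a GPU when one is available; the same code runs on CPU otherwise.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Two hypothetical meshes with different vertex counts (3 and 4 vertices).
verts_a = torch.tensor([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
verts_b = torch.tensor(
    [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [2.0, 2.0, 0.0], [0.0, 2.0, 0.0]]
)
verts_list = [v.to(device).requires_grad_(True) for v in (verts_a, verts_b)]

# Padded batch of shape (batch, max_verts, 3); missing rows are zero-filled.
padded = pad_sequence(verts_list, batch_first=True)
num_verts = torch.tensor([v.shape[0] for v in verts_list], device=device)

# Boolean mask of shape (batch, max_verts): True where a row is a real vertex.
mask = torch.arange(padded.shape[1], device=device)[None, :] < num_verts[:, None]

# Per-mesh centroid: masked sum of vertices divided by the true vertex count.
centroids = (padded * mask[..., None]).sum(dim=1) / num_verts[:, None]

# Toy objective: squared distance of each centroid from the origin.
loss = (centroids ** 2).sum()
loss.backward()  # gradients reach every original vertex tensor

print(padded.shape)
print(centroids)
print(verts_list[0].grad)
```

Padding plus a mask is only one way to batch inputs of unequal size; the point of the sketch is that a single tensor expression can process the whole minibatch while each mesh keeps its own vertex count and still receives gradients.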

