Depth Estimation of Endoscopic Scene in Robotic Surgery
The Challenge

The future of robotic surgery, with the incorporation of AR/VR, can lead to the development and deployment of novel safety tools that augment the capabilities of surgeons. One of the key engineering building blocks needed to make this a reality is knowing the depth of the different anatomical structures within the endoscopic view. This enables a better understanding of the spatial relationships between surgical and anatomical objects, gives the surgical team options to minimize risk, and allows human experts to stay in control.

In this case study, we leveraged the surgeon's stereo view, with both left and right images, along with the camera calibration metadata provided by Intuitive Surgical, to develop a depth estimation model that generates the depth of each pixel in each frame. Using only the stereo view, an interactive 3D view can be built that shows the surgical scene from multiple perspectives, which in turn allows for the development of more accurate and useful safety tools.
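The stereo geometry behind this is standard: given the calibrated focal length f, the baseline b between the left and right cameras, and a per-pixel disparity d, depth is Z = f·b / d. A minimal numpy sketch of that conversion (the calibration values below are illustrative only, not the actual da Vinci parameters):

```python
import numpy as np

def disparity_to_depth(disparity, focal_length_px, baseline_mm):
    """Convert a stereo disparity map (pixels) to a depth map (mm).

    Depth Z = f * b / d for each pixel with disparity d > 0; pixels
    with no valid stereo match (d <= 0) are set to 0.
    """
    depth = np.zeros_like(disparity, dtype=np.float64)
    valid = disparity > 0
    depth[valid] = focal_length_px * baseline_mm / disparity[valid]
    return depth

# Illustrative values -- real ones come from the camera calibration metadata.
disparity = np.array([[10.0, 20.0],
                      [0.0, 40.0]])
depth = disparity_to_depth(disparity, focal_length_px=1000.0, baseline_mm=4.0)
# Larger disparity means the point is closer to the camera.
```

Note that depth error grows with distance for a fixed disparity error, which is one reason sub-pixel disparity accuracy matters for hitting clinical error thresholds.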

Approach
01

Data Preparation: As a first step, the video was separated into frames, and the point cloud was augmented with a notion of locality based on depth. The camera rotation was then applied to the point cloud to generate the depth maps used for training.
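The last step above, turning a rotated point cloud into a per-pixel depth map, can be sketched with a simple pinhole projection. This is an illustrative implementation under assumed intrinsics, not the exact pipeline used; where several points land on the same pixel, the nearest one wins:

```python
import numpy as np

def render_depth_map(points, rotation, fx, fy, cx, cy, height, width):
    """Project a 3D point cloud (N x 3, mm) into a depth map.

    `rotation` is the 3x3 camera rotation applied to the cloud; each point
    is projected with the pinhole model, and the smallest z per pixel is
    kept to mimic visibility. Pixels with no points are 0.
    """
    cam = points @ rotation.T
    depth = np.full((height, width), np.inf)
    for x, y, z in cam:
        if z <= 0:
            continue  # point is behind the camera
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            depth[v, u] = min(depth[v, u], z)
    depth[np.isinf(depth)] = 0.0
    return depth

# Identity rotation, one point 50 mm in front of a tiny 4x4 camera.
pts = np.array([[0.0, 0.0, 50.0]])
dm = render_depth_map(pts, np.eye(3), fx=1.0, fy=1.0,
                      cx=2.0, cy=2.0, height=4, width=4)
```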

02

Base Model Selection: We explored several base models to validate our data preparation process, starting with a UNet. We then iteratively augmented the UNet and eventually arrived at a PSMNet-like architecture, which produced the best model inferences.
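The core idea that distinguishes a PSMNet-style network from a plain UNet is the stereo cost volume: left-image features are concatenated with right-image features shifted by each candidate disparity, and a 3D CNN then regresses disparity from this volume. A numpy sketch of just the cost-volume construction (feature shapes and `max_disp` here are illustrative):

```python
import numpy as np

def build_cost_volume(left_feat, right_feat, max_disp):
    """Build a PSMNet-style concatenation cost volume.

    Inputs are (C, H, W) feature maps; output is (max_disp, 2C, H, W),
    where slice d pairs left features with right features shifted d
    pixels. Columns with no valid shift remain zero.
    """
    c, h, w = left_feat.shape
    volume = np.zeros((max_disp, 2 * c, h, w))
    for d in range(max_disp):
        volume[d, :c, :, d:] = left_feat[:, :, d:]
        if d > 0:
            volume[d, c:, :, d:] = right_feat[:, :, :-d]
        else:
            volume[d, c:, :, :] = right_feat
    return volume

left = np.random.rand(8, 16, 16)
right = np.random.rand(8, 16, 16)
vol = build_cost_volume(left, right, max_disp=4)
print(vol.shape)  # (4, 16, 16, 16)
```

In the real network this volume is produced from learned feature extractors and aggregated by stacked 3D convolutions, which is what makes the architecture heavier but more accurate than a 2D UNet.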

03

Model Training: We clipped the magnitude of the gradient as needed, masked the portions of frames where we found bad data, and trained with a mean absolute error (MAE) loss and the Adam optimizer with weight decay, iteratively reweighting samples based on the data at hand.
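Masking bad-data regions out of the loss amounts to computing the MAE only over pixels with trustworthy ground truth. A minimal numpy sketch of such a masked loss (a stand-in for the actual training loss, not the exact code used):

```python
import numpy as np

def masked_mae(pred, target, mask):
    """Mean absolute error over valid (unmasked) pixels only.

    `mask` is a boolean array marking pixels with trustworthy
    ground-truth depth; bad-data regions contribute nothing to the loss.
    """
    valid = mask.astype(bool)
    if not valid.any():
        return 0.0
    return float(np.abs(pred[valid] - target[valid]).mean())

pred = np.array([[1.0, 5.0], [3.0, 9.0]])
target = np.array([[2.0, 5.0], [0.0, 9.0]])
mask = np.array([[True, True], [False, True]])  # bottom-left pixel is bad data
loss = masked_mae(pred, target, mask)  # averages |1-2|, |5-5|, |9-9|
```

Excluding masked pixels entirely, rather than zeroing their error, keeps the loss scale independent of how much of each frame is masked.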

04

Experiments: We performed a number of experiments generating a rotating 3D view of the surgical scene from a 2D frame and its depth map, allowing the scene to be viewed from different perspectives.
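Building that 3D view starts by lifting each pixel of the depth map back into 3D with the inverse pinhole model; the resulting colored point cloud can then be re-rendered from any virtual camera pose. A small sketch of the back-projection step under assumed intrinsics:

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Lift a depth map (H x W, mm) into an (N, 3) point cloud.

    Each valid pixel (u, v) with depth z maps to
    x = (u - cx) * z / fx, y = (v - cy) * z / fy; pixels with
    zero depth (no estimate) are dropped.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    z = depth
    valid = z > 0
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x[valid], y[valid], z[valid]], axis=1)

# One valid pixel at the principal point, 50 mm away.
depth = np.zeros((4, 4))
depth[2, 2] = 50.0
cloud = backproject(depth, fx=1.0, fy=1.0, cx=2.0, cy=2.0)
```

Rotating and re-projecting this cloud is what produces the novel viewpoints of the scene described above.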

Results
We won the 2019 Stereo Correspondence and Reconstruction of Endoscopic Data challenge with an average depth error of about 3 mm. Since the clinically significant registration error threshold is about 2 mm, we may be close to a solution that can assist surgeons in a clinically meaningful way.
Conclusion
Our model won the competition with a mean absolute error of 3 mm across two different test scenes. With access to the 3D structure of a scene, a surgeon can view it from multiple perspectives; complex surgical tasks can then be planned and executed more effectively, and the exact location of the camera matters less. The results are encouraging, but more accurate depth data across varied surgical scenes is needed to build more precise, production-ready depth perception models.

Featured Work

All Data Inclusive, Deep Learning Models to Predict Critical Events in the Medical Information Mart for Intensive Care III Database (MIMIC III)

Artificial Intelligence and Robotic Surgery: Current Perspective and Future Directions

Augmented Intelligence: A synergy between man and the machine

Building Artificial Intelligence (AI) Based Personalized Predictive Models (PPM)

Predicting intraoperative and postoperative consequential events using machine learning techniques in patients undergoing robotic partial nephrectomy (RPN)

Stereo Correspondence and Reconstruction of Endoscopic Data Challenge