Instrument Segmentation of a Surgical Robot

2D Semantic Segmentation of a Robotic Surgical Scene

The Challenge

An ultimate goal for robotic surgery is one in which surgical tasks are performed autonomously, with accuracy exceeding that of human surgeons. Several technical building blocks must first be in place to make this goal a reality.

One such engineering problem is the real-time segmentation and tracking of anatomical and surgical objects during surgery. Solving it has many practical applications in surgeon training and in patient-safety tools, particularly when combined with augmented and virtual reality.

In this case study, we developed multiple computer vision models to detect and track anatomical and surgical objects of interest, applying state-of-the-art CV models to annotated data provided by Intuitive Surgical.

Approach

01

Data Preparation: To compensate for the limited training data, we applied a number of data augmentation techniques so the models could extract as much information as possible from the available frames.
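As an illustration, the minimal PyTorch/torchvision sketch below applies identical random geometric transforms to a frame and its segmentation mask so the pixel labels stay aligned. The specific transforms and parameter ranges are assumptions; the study's actual augmentation set is not documented.

```python
import random
import torchvision.transforms.functional as TF
from torchvision.transforms import InterpolationMode

def augment_pair(image, mask):
    """Apply the same random geometric transforms to an image and its
    segmentation mask so labels stay aligned. Illustrative only: the
    study's actual augmentations and parameter ranges are assumed."""
    # Random horizontal flip applied to both image and mask
    if random.random() < 0.5:
        image, mask = TF.hflip(image), TF.hflip(mask)
    # Small random rotation; nearest-neighbour keeps mask labels discrete
    angle = random.uniform(-15.0, 15.0)
    image = TF.rotate(image, angle)
    mask = TF.rotate(mask, angle, interpolation=InterpolationMode.NEAREST)
    # Photometric jitter on the image only (labels are unaffected)
    image = TF.adjust_brightness(image, random.uniform(0.8, 1.2))
    return image, mask
```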

02

Base Model Selection: Drawing on work from non-healthcare domains, we explored a number of convolutional neural network architectures, including ResNet, DeepLab V3, Fast R-CNN, and U-Net, on various AI platforms, with cost and speed of execution as key factors.
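For example, candidate architectures can be instantiated from torchvision's model zoo, as in the sketch below. Note that torchvision ships Faster R-CNN, the successor to Fast R-CNN, which stands in here, and the class count is illustrative rather than the study's actual label set.

```python
import torchvision

NUM_CLASSES = 11  # illustrative label count, not the study's actual one

# Candidate architectures for base-model selection. A sketch only: the
# study's exact variants and platforms are not specified, and torchvision's
# Faster R-CNN stands in for the Fast R-CNN named above.
candidates = {
    "deeplabv3": torchvision.models.segmentation.deeplabv3_resnet50(
        weights=None, num_classes=NUM_CLASSES),
    "faster_rcnn": torchvision.models.detection.fasterrcnn_resnet50_fpn(
        weights=None, num_classes=NUM_CLASSES),
}
```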

03

Model Training: We arrived at two models: one based on DeepLab V3 (TensorFlow) for segmentation accuracy, and one based on Fast R-CNN (PyTorch) to generate inferences fast enough for real-time segmentation.
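A minimal sketch of the accuracy-oriented path is shown below, assuming a torchvision-style DeepLab V3 whose forward pass returns a dict with an "out" logits tensor; the study's TensorFlow pipeline would differ in detail.

```python
import torch

@torch.no_grad()
def segment_frame(model, frame):
    """Run one preprocessed video frame (a [3, H, W] float tensor) through
    a DeepLab V3-style model and return per-pixel class ids. Preprocessing
    (resizing, normalisation) is assumed and omitted here."""
    model.eval()
    logits = model(frame.unsqueeze(0))["out"]   # [1, C, H, W] class logits
    return logits.argmax(dim=1).squeeze(0)      # [H, W] predicted class ids
```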

04

Hardware: A combination of cloud-hosted GPUs and local GPU compute engines was used for data preparation, training, and inference generation. The elastic scaling of the cloud hardware allowed us to tailor computing costs to our needs.

05

Experiments: We experimented with a number of base models, including U-Net, and with different AI platforms to arrive at an approach that balanced time and cost constraints against model accuracy. A sketch of one such base model follows below.
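The compact U-Net below illustrates the kind of encoder-decoder architecture we experimented with. Its depth, channel widths, and class count are assumptions for illustration, not the study's configuration.

```python
import torch
import torch.nn as nn

class MiniUNet(nn.Module):
    """A deliberately small U-Net-style encoder-decoder. Illustrative only:
    depth, channel widths, and num_classes are assumed, not the study's."""
    def __init__(self, in_ch=3, num_classes=11):
        super().__init__()
        def block(i, o):
            return nn.Sequential(
                nn.Conv2d(i, o, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(o, o, 3, padding=1), nn.ReLU(inplace=True))
        self.enc1, self.enc2 = block(in_ch, 32), block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = block(64, 32)                 # 64 = 32 skip + 32 upsampled
        self.head = nn.Conv2d(32, num_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                        # full-resolution features
        e2 = self.enc2(self.pool(e1))            # downsampled features
        d = self.up(e2)                          # upsample back
        d = self.dec(torch.cat([e1, d], dim=1))  # skip connection
        return self.head(d)                      # per-pixel class logits
```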

Results

% mIoU on challenging scenes with sparse ground truth

% mIoU on easier scenes with dense ground truth

% IoU on certain anatomical objects
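For context, mean Intersection over Union (mIoU) averages the per-class overlap between predicted and ground-truth masks. A standard NumPy formulation is sketched below; the handling of absent classes and ignored labels is our assumption, not the study's documented evaluation protocol.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean Intersection-over-Union across classes present in either the
    prediction or the ground truth. Assumed details (ignored label,
    skipping absent classes) may differ from the study's protocol."""
    valid = target != ignore_label
    ious = []
    for c in range(num_classes):
        p = (pred == c) & valid
        t = (target == c) & valid
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent everywhere: skip rather than count it
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious)) if ious else 0.0
```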

Conclusion

The data used in this effort was generated in porcine labs and is far from the practical reality of human surgeries. Research in this area drives a need for large volumes of high-quality annotated surgical video, together with its metadata, to provide adequate data for AI/ML model development.

The current approach of supervised learning is not scalable because of the time and effort it takes to label the anatomical objects of interest. It is also difficult to label soft tissue accurately without the domain expertise of a clinician or surgeon.

Any flaws introduced in the labelling process carry over into model accuracy. Privacy concerns and the lack of incentives for surgeons to provide high-quality surgical data are major inhibitors to meaningful progress.

The model accuracy achieved is adequate at best and far from good enough for clinical use.

Featured Work

All Data Inclusive, Deep Learning Models to Predict Critical Events in the Medical Information Mart for Intensive Care III Database (MIMIC III)

Artificial Intelligence and Robotic Surgery: Current Perspective and Future Directions

Augmented Intelligence: A synergy between man and the machine

Building Artificial Intelligence (AI) Based Personalized Predictive Models (PPM)

Predicting intraoperative and postoperative consequential events using machine learning techniques in patients undergoing robotic partial nephrectomy (RPN)

Stereo Correspondence and Reconstruction of Endoscopic Data Challenge