Implementation and optimization of ultrasound image classification and segmentation on a portable device based on deep learning

Vasilyev, Mikhail

Master's thesis

Permanent link

http://hdl.handle.net/11250/2616053

Publication date

2018

Collections

Institutt for teknisk kybernetikk

Abstract

Ultrasound imaging is a popular technology used in medicine for obtaining 2D and 3D images of body tissues. In echocardiography it is used to assess the heart and detect potential cardiac diseases. Recent technological progress has made this procedure mobile and available even to non-expert users. Creating software that aids these users in interpreting results is therefore highly desirable. This involves implementing image processing on a portable device. This thesis is inspired by the idea that such offline processing can achieve adequate real-time performance and accuracy using deep learning, namely deep convolutional neural networks.

Ultrasound examination of patients is based on sending high-frequency sound waves into body tissues and receiving the reflections. Characteristics of the received signal reveal information about the inner structure of the tissues. The initial signal is generated by a transducer that converts electrical energy to sound. A digital image can be processed in different ways. Classification means assigning a single label to each image, while segmentation is the process of dividing the image into several non-overlapping regions. Artificial neural networks are a method for supervised machine learning. Their key elements are perceptrons, each of which takes one or several inputs and applies a mathematical function that returns one output. A network consists of several layers of interconnected perceptrons and can be trained using a procedure called backpropagation. Convolutional networks are a type of neural network specifically designed for image processing, featuring special layer types such as convolutional and pooling layers. Recent years have seen several new convolutional network architectures developed for image classification and segmentation, including Mobilenet, CVC-net and U-net. Several software frameworks implement engines for running arbitrary architectures on portable devices. The most relevant among them are TensorFlow Mobile, TensorFlow Lite and Qualcomm SNPE.
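
The building blocks named above (a perceptron, a convolutional layer and a pooling layer) can be illustrated with a minimal pure-Python sketch. It is not code from the thesis: the function names, the sigmoid activation, the 2x2 filter and the toy 5x5 image are all assumptions chosen for illustration, and the "convolution" is the unflipped sliding-window product commonly used in deep learning.

```python
import math


def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs followed by a sigmoid activation (assumed)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def conv2d_valid(image, kernel):
    """Slide the kernel over the image with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 block."""
    return [
        [
            max(feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1])
            for j in range(0, len(feature_map[0]) - 1, 2)
        ]
        for i in range(0, len(feature_map) - 1, 2)
    ]


# Toy 5x5 "image" whose pixel at (row, col) has value 5 * row + col.
image = [[float(5 * r + c) for c in range(5)] for r in range(5)]
kernel = [[1.0, 0.0], [0.0, -1.0]]      # arbitrary 2x2 filter (assumption)

features = conv2d_valid(image, kernel)  # 4x4 feature map
pooled = max_pool2x2(features)          # 2x2 map after pooling
activation = perceptron([0.0, 0.0], [1.0, 1.0], 0.0)
```

In a real network the filter weights and perceptron weights are learned by backpropagation rather than fixed by hand as they are here.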

To test the performance of these frameworks with different network models, an Android application has been created. Additional code has been written to facilitate training of those networks for classification and segmentation of ultrasound images of the left ventricle. Several technologies and programming languages have been used, including C++, Java and Python, and distinct principles and routines have been adopted for working with each of them. The resulting code is optimized for performance, and its architecture focuses on modifiability and maintainability. The Android app can read images from nonvolatile memory or receive them via a network. It can perform classification and segmentation and show results in real time using different frameworks and network models that can be selected at runtime.

The results consist of average processing times for different models and processing engines, as well as the accuracies of the trained convolutional networks. Of the classification models tested, MobilenetV1 and MobilenetV2 both showed satisfactory processing times, but the latter achieved higher accuracy. The highest accuracy, 0.98, was obtained using CVC-net, but that architecture failed to meet the time requirements for real-time processing. The architectures tested for segmentation turned out to be incompatible with most of the tested frameworks and showed unacceptably long processing times on the only one that was able to perform inference on them. Several other network architectures were tested to investigate the dependencies between model parameters and performance on different runtime engines. The results led to the conclusion that the choices of framework and network architecture are not independent, as models have different relative performances on different runtimes. An alternative solution involving remote image processing on a web server has been discussed. Despite having several advantages, it entails a significant decrease in the availability of the system.
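
One way such average processing times could be measured and compared against a real-time budget is sketched below. This is an illustrative assumption and not the thesis's benchmarking code: `average_latency_ms`, `meets_real_time`, the warm-up count, the 30 frames-per-second target and the stand-in `fake_infer` function are all hypothetical.

```python
import time


def average_latency_ms(infer, frames, warmup=2, clock=time.perf_counter):
    """Return the mean time in milliseconds that infer() takes per frame.

    A few untimed warm-up calls are made first so that one-off
    initialization costs do not distort the average.
    """
    for frame in frames[:warmup]:
        infer(frame)
    timings = []
    for frame in frames:
        start = clock()
        infer(frame)
        timings.append((clock() - start) * 1000.0)
    return sum(timings) / len(timings)


def meets_real_time(avg_ms, target_fps):
    """True if one frame is processed within the per-frame time budget."""
    return avg_ms <= 1000.0 / target_fps


def fake_infer(frame):
    """Stand-in for a model call (hypothetical); sums the pixel values."""
    return sum(frame)


frames = [[0.0] * 1024 for _ in range(20)]   # dummy input frames
avg_ms = average_latency_ms(fake_infer, frames)
print(f"average: {avg_ms:.3f} ms, "
      f"real-time at 30 fps: {meets_real_time(avg_ms, 30)}")
```

Running the same measurement for every combination of model and runtime engine would expose the interaction described above, where a model's relative ranking changes from one runtime to another.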

Publisher

NTNU