dc.description.abstract | Social Media produces vast amounts of user-generated content (UGC) every second, and
images increasingly enrich this content. The need for effective ways to organize
and categorize content is greater than ever. The proliferation of Big Data also offers
new opportunities for utilizing UGC in recommender systems. However, given the
noisy and unstructured nature of user-generated text, extracting valuable knowledge
from it is not an easy task. Therefore, this thesis turns to images instead.
With the goal of extracting usable knowledge from these Social Media images, this
thesis proposes a novel approach to predicting the tags and content of an image from
Social Media using deep convolutional neural networks (deep CNNs) and word
embedding models.
A pre-trained computer vision model is used to classify an image and extract predictions
of its most likely content. These predictions are then evaluated against the image's tags
to measure the model's tag prediction ability. Each prediction is then used to retrieve
syntactically and semantically similar terms from a word embedding model. Using this
aggregated information, the model's prediction ability is re-evaluated and the performances are compared.
In addition, the predictions are studied qualitatively to understand their degree of relevance.
The model is evaluated on a subset of the MIRFLICKR25000 data set, which consists
of 25000 images under the Creative Commons licence gathered from the Social Media
platform Flickr. Although image auto-tagging has been thoroughly researched, tag
prediction from images using computer vision and word embeddings in this way has not
been done previously. The evaluation of this model on the data subset shows that accuracy
comparable to the state of the art is achieved. Although the results are not groundbreaking
in terms of accuracy, they show a significant increase when queries are expanded using a
word embedding model. | |