Show simple item record

dc.contributor.advisor: Skramstad, Torbjørn [nb_NO]
dc.contributor.advisor: Alsam, Ali [nb_NO]
dc.contributor.author: Sharma, Puneet [nb_NO]
dc.date.accessioned: 2014-12-19T13:41:21Z
dc.date.available: 2014-12-19T13:41:21Z
dc.date.created: 2014-09-30 [nb_NO]
dc.date.issued: 2014 [nb_NO]
dc.identifier: 750810 [nb_NO]
dc.identifier.isbn: 978-82-326-0214-8 (printed ver.) [nb_NO]
dc.identifier.isbn: 978-82-326-0215-5 (electronic ver.) [nb_NO]
dc.identifier.uri: http://hdl.handle.net/11250/253706
dc.description.abstract: A salient image region is defined as an image part that is clearly different from its surround in terms of a number of attributes. In bottom-up processing, these attributes are defined as contrast, color difference, brightness, and orientation. By measuring these attributes, visual saliency algorithms aim to predict the regions in an image that would attract our attention under free-viewing conditions, i.e., when the observer is viewing an image without a specific task such as searching for an object. To quantify the interesting locations in a scene, the output of a visual saliency algorithm is usually expressed as a two-dimensional grayscale map in which the brighter regions correspond to the highly salient regions of the original image. In addition to advancing our understanding of the human visual system, visual saliency models can be used in a number of computer vision applications, including image compression, computer graphics, image matching and recognition, design, and human-computer interaction.

The main contributions of this thesis are as follows. First, we present a method to inspect the performance of Itti’s classic saliency algorithm in separating salient from non-salient image locations. Our results show that, although the saliency model discriminates well between highly salient and non-salient regions, there is a large overlap between locations in the middle range of saliency. Second, we propose a new bottom-up visual saliency model for static two-dimensional images, in which saliency is calculated using the transformations associated with the dihedral group D4. Our results suggest that the proposed model outperforms many state-of-the-art saliency models, and the methodology can be extended to calculate saliency in three-dimensional scenes, which we intend to implement in the future. Third, we propose a way to perform statistical analysis of fixation data from different observers and different images, and based on this analysis we present a robust metric for judging the performance of visual saliency algorithms. Our results show that the proposed metric can indeed alleviate the problems pertaining to the evaluation of saliency models. Fourth, we introduce a new approach to compressing an image based on the salient locations predicted by saliency models; the compressed images do not exhibit visual artifacts and appear very similar to the originals. Fifth, we outline a method to estimate depth from eye fixations in three-dimensional virtual scenes, which can be used for creating so-called gaze maps for three-dimensional scenes; in the future, these can serve as ground truth for judging the performance of saliency algorithms on three-dimensional images. We believe that our contributions can lead to a better understanding of saliency, address the major issues associated with the evaluation of saliency models, highlight the contributions of top-down and bottom-up processing through the analysis of a comprehensive eye-tracking dataset, promote the use of image-processing applications steered by human vision, and pave the way for calculating saliency in three-dimensional scenes. [nb_NO]
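The abstract's second contribution computes saliency from the transformations associated with the dihedral group D4, i.e., the eight symmetries of a square: four rotations and four reflections. As an illustrative sketch only (this is not the thesis' saliency algorithm, just the group action on an image patch), the eight transforms can be enumerated with NumPy:

```python
import numpy as np

def d4_transforms(patch):
    """Return the eight D4 transformations of a square patch:
    rotations by 0, 90, 180, 270 degrees, plus the horizontal
    reflection of each rotation. Sketch for illustration only."""
    rotations = [np.rot90(patch, k) for k in range(4)]
    reflections = [np.fliplr(r) for r in rotations]
    return rotations + reflections

# A generic (asymmetric) 3x3 patch: all eight transforms are distinct.
patch = np.arange(9).reshape(3, 3)
group = d4_transforms(patch)
```

For an asymmetric patch, the eight transforms are pairwise distinct; a saliency measure built on them can compare how much a patch changes under each symmetry.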
dc.language: eng [nb_NO]
dc.publisher: Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap [nb_NO]
dc.relation.ispartofseries: Doktoravhandlinger ved NTNU, 1503-8181; 2014:146 [nb_NO]
dc.relation.haspart: Alsam, Ali; Sharma, Puneet. Analysis of eye fixations data. Proceedings of the 13th IASTED International Conference on Signal and Image Processing: 342-349, 2011. [nb_NO]
dc.relation.haspart: Sharma, Puneet; Alsam, Ali. A robust metric for the evaluation of visual saliency models. Proceedings of the 9th International Conference on Computer Vision Theory and Applications: 654-661, 2014. [nb_NO]
dc.relation.haspart: Alsam, Ali; Sharma, Puneet. Robust metric for the evaluation of visual saliency algorithms. Journal of the Optical Society of America A 31(3): 532-540, 2014. doi:10.1364/JOSAA.31.000532 [nb_NO]
dc.relation.haspart: Alsam, Ali; Sharma, Puneet. Validating the Visual Saliency Model. Image Analysis: 18th Scandinavian Conference, SCIA 2013, Espoo, Finland, June 17-20, 2013, Proceedings: 153-161, 2013. doi:10.1007/978-3-642-38886-6_15 [nb_NO]
dc.relation.haspart: Alsam, Ali; Sharma, Puneet; Wrålsen, Anette. Asymmetry as a Measure of Visual Saliency. Image Analysis: 18th Scandinavian Conference, SCIA 2013, Espoo, Finland, June 17-20, 2013, Proceedings: 591-600, 2013. doi:10.1007/978-3-642-38886-6_55 [nb_NO]
dc.relation.haspart: Alsam, Ali; Sharma, Puneet; Wrålsen, Anette. Calculating Saliency Using the Dihedral Group D4. Journal of Imaging Science and Technology (ISSN 1062-3701) 58(1): 10504-1-10504-12, 2014. doi:10.2352/J.ImagingSci.Technol.2014.58.1.010504 [nb_NO]
dc.relation.haspart: Alsam, Ali; Rivertz, Hans Jakob; Sharma, Puneet. What the Eye Did Not See – A Fusion Approach to Image Coding. Advances in Visual Computing: 8th International Symposium, ISVC 2012, Rethymnon, Crete, Greece, July 16-18, 2012, Revised Selected Papers, Part II: 199-208, 2012. doi:10.1007/978-3-642-33191-6_20 [nb_NO]
dc.relation.haspart: Alsam, Ali; Rivertz, Hans Jakob; Sharma, Puneet. What the eye did not see – a fusion approach to image coding. International Journal on Artificial Intelligence Tools (ISSN 0218-2130) 22(6), 2013. doi:10.1142/S0218213013600142 [nb_NO]
dc.relation.haspart: Sharma, Puneet; Nilsen, Jan Harald; Skramstad, Torbjørn; Alaya Cheikh, Faouzi. Evaluation of Geometric Depth Estimation Model for Virtual Environment. Norsk Informatikkonferanse NIK-2010: 166-177, 2010. [nb_NO]
dc.relation.haspart: Sharma, Puneet; Alsam, Ali. Estimating the Depth in Three-Dimensional Virtual Environment with Feedback. Proceedings of the 14th IASTED International Conference on Signal and Image Processing: 9-17, 2012. [nb_NO]
dc.relation.haspart: Sharma, Puneet; Alsam, Ali. Estimating the Depth Uncertainty in Three-Dimensional Virtual Environment. Proceedings of the 14th IASTED International Conference on Signal and Image Processing: 18-25, 2012. [nb_NO]
dc.title: Towards three-dimensional visual saliency [nb_NO]
dc.type: Doctoral thesis [nb_NO]
dc.contributor.department: Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap [nb_NO]
dc.description.degree: PhD i informasjonsteknologi [nb_NO]
dc.description.degree: PhD in Information Technology [en_GB]

