Vis enkel innførsel

dc.contributor.authorRisvik, Knut Magnenb_NO
dc.date.accessioned2014-12-19T13:30:06Z
dc.date.available2014-12-19T13:30:06Z
dc.date.created2007-09-28nb_NO
dc.date.issued2004nb_NO
dc.identifier122755nb_NO
dc.identifier.isbn82-471-6317-9nb_NO
dc.identifier.urihttp://hdl.handle.net/11250/249870
dc.description.abstractThis thesis focuses on methods and analysis for building scalable Internet Search Engines. In this work, we have developed a search kernel, an architecture framework and applications that are being used in industrial and commercial products. Furthermore, we present both analysis and design of key elements. Essential to building a large-scale search engine is to understand the dynamics of the content in which we are searching. For the challenging case of searching the web, there are multiple dimensions of dynamics that should ideally be handled. In this thesis we start by examining some of these dimensions and the implications they have on search engine design. When designing a search engine kernel, the focus has been on selection of algorithms and datastructures in the general case. Also, and even more important, we design worst-case characteristics into the search kernel that are very decisive from a scaling standpoint. A performance model to analyze the behaviour of the kernel is also developed. The designed search engine kernel was realized as a predecessor of the current FAST Search kernel (the FMS kernel), and practical experiments and benchmarking demonstrate the correctness of the assumptions from the design of the kernel. Then a framework for scaling shared-nothing systems based upon nodes working on separate portions of the data is introduced. The design of the framework is based on the general principles of replication and distribution. A performance model and an algorithm for cluster design are provided. This is in turn applied to construct a larger-scale web search engine and benchmarking of clusters indicate that the assumptions and models for the distributed architecture hold. The scaling aspect of search engine is further studied in the context of the application itself. Query locality is explored and used to create an architecture that is a generalized type of caching (through partial replication) using the application behaviour and a configurable correctness trade-off to design super-linear scalable search engines. Finally, a discussion of how linguistics are being used in web search engines is provided, focusing on the constraints that apply to ensure the desired scalability.nb_NO
dc.languageengnb_NO
dc.publisherFakultet for informasjonsteknologi, matematikk og elektroteknikknb_NO
dc.relation.ispartofseriesDoktoravhandlinger ved NTNU, 1503-8181; 2004:54nb_NO
dc.relation.haspartRisvik, Knut Magne; Michelsen, Rolf. Search Engines and Web Dynamics. .nb_NO
dc.relation.haspartRisvik, Knut Magne; Egge, Tor. The FMS Search Kernel and its Performance Characteristics. .nb_NO
dc.relation.haspartRisvik, Knut Magne; Svingen, Børge; Egge, Tor; Halaas, Arne. The FAST Distributed Processing Architecture (DPA) and its Application for a Larger-Scale Search Engine. .nb_NO
dc.relation.haspartRisvik, Knut Magne; Aasheim, Yngve; Lidal, Mathias. Multi-tier Architecture for web Search Engines. .nb_NO
dc.relation.haspartGulla, Jon Atle; Auran, Per Gunnar; Risvik, Knut Magne. Linguistics in Large-Scale Web Search. .nb_NO
dc.titleScaling Internet Search Engines - Methods and Analysisnb_NO
dc.typeDoctoral thesisnb_NO
dc.contributor.departmentNorges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskapnb_NO
dc.description.degreeDr.philos.nb_NO
dc.description.degreeDr.philos.en_GB


Tilhørende fil(er)

Thumbnail

Denne innførselen finnes i følgende samling(er)

Vis enkel innførsel