A Desktop Grid Computing Approach for Scientific Computing and Visualization
MetadataShow full item record
Scientific Computing is the collection of tools, techniques, and theories required to solve on a computer, mathematical models of problems from science and engineering, and its main goal is to gain insight in such problems. Generally, it is difficult to understand or communicate information from complex or large datasets generated by Scientific Computing methods and techniques (computational simulations, complex experiments, observational instruments etc.). Therefore, support of Scientific Visualization is needed, to provide the techniques, algorithms, and software tools needed to extract and display appropriately important information from numerical data. Usually, complex computational and visualization algorithms require large amounts of computational power. The computing power of a single desktop computer is insufficient for running such complex algorithms, and, traditionally, large parallel supercomputers or dedicated clusters were used for this job. However, very high initial investments and maintenance costs limit the availability of such systems. A more convenient solution, which is becoming more and more popular, is based on the use of nondedicated desktop PCs in a Desktop Grid Computing environment. Harnessing idle CPU cycles, storage space and other resources of networked computers to work together on a particularly computational intensive application does this. Increasing power and communication bandwidth of desktop computers provides for this solution. In a desktop grid system, the execution of an application is orchestrated by a central scheduler node, which distributes the tasks amongst the worker nodes and awaits workers’ results. An application only finishes when all tasks have been completed. The attractiveness of exploiting desktop grids is further reinforced by the fact that costs are highly distributed: every volunteer supports her resources (hardware, power costs and internet connections) while the benefited entity provides management infrastructures, namely network bandwidth, servers and management services, receiving in exchange a massive and otherwise unaffordable computing power. The usefulness of desktop grid computing is not limited to major high throughput public computing projects. Many institutions, ranging from academics to enterprises, hold vast number of desktop machines and could benefit from exploiting the idle cycles of their local machines. In the work presented in this thesis, the central idea has been to provide a desktop grid computing framework and to prove its viability by testing it in some Scientific Computing and Visualization experiments. We present here QADPZ, an open source system for desktop grid computing that have been developed to meet the above presented needs. QADPZ enables users from a local network or Internet to share their resources. It is a multi-platform, heterogeneous system, where different computing resources from inside an organization can be used. It can be used also for volunteer computing, where the communication infrastructure is the Internet. QADPZ supports the following native operating systems: Linux, Windows, MacOS and Unix variants. The reason behind natively supporting multiple operating systems, and not only one (Unix or Windows, as other systems do), is that often, in real life, this kind of limitation restricts very much the usability of desktop grid computing. QADPZ provides a flexible object-oriented software framework that makes it easy for programmers to write various applications, and for researchers to address issues such as adaptive parallelism, fault-tolerance, and scalability. The framework supports also the execution of legacy applications, which for different reasons could not be rewritten, and that makes it suitable for other domains as business. It also supports low-level programming languages as C/C++ or high-level language applications, (e.g. Lisp, Python, and Java), and provides the necessary mechanisms to use such applications in a computation. Consequently, users with various backgrounds can benefit from using QADPZ. The flexible object-oriented structure and the modularity allow facile improvements and further extensions to other programming languages. We have developed a general-purpose runtime and an API to support new kinds of high performance computing applications, and therefore to benefit from the advantages offered by desktop grid computing. This API directly supports the C/C++ programming language. We have shown how distributed computing extends beyond the master-worker paradigm (typical for such systems) and provided QADPZ with an extended API that supports in addition lightweight tasks and parallel computing (using the message passing paradigm - MPI). This extends the range of applications that can be used to already existing MPI based applications - e.g. parallel numerical solvers used in computational science, or parallel visualization algorithms. Another restriction of existing systems, especially middleware based, is that each resource provider needs to install a runtime module with administrator privileges. This poses some issues regarding data integrity and accessibility on providers computers. The QADPZ system tries to overcome this by allowing the middleware module to run as a non-privileged user, even with restricted access, to the local system. QADPZ provides also low-level optimizations, such as on-the-fly compression and encryption for communication. The user can choose from different algorithms, depending on the application, improving both the communication overhead imposed by large data transfers and keeping privacy of the data. The system goes further, by providing an experimental, adaptive compression algorithm, which can transparently choose different algorithms to improve the application. QADPZ support two different protocols (UDP and TCP/IP) in order to improve the efficiency of communication. Free source code allows its flexible installations and modifications based on the particular needs of research projects and institutions. In addition to being a very powerful tool for computationally intensive research, the open sourceness makes QADPZ a flexible educational platform for numerous smallsize student projects in the areas of operating systems, distributed systems, mobile agents, parallel algorithms, etc. Open source software is a natural choice for modern research as well, because it encourages effectively integration, cooperation and boosting of new ideas. This thesis proposes also an improved conceptual model (based on the master-worker paradigm), which makes contributions in several directions: pull vs. push work-units, pipelining of work-units, more work-units sent at a time, adaptive number of workers, adaptive time-out interval for work-units, and multithreading. We have also demonstrated that the use of desktop grids should not be limited to only master-worker applications, but it can be used for more fine-grained parallel Scientific Computing and Visualization applications, by performing some specific experiments. This thesis makes supplementary contributions: a hierarchical taxonomy of the main existing desktop grids, and an adaptive compression algorithm for remote visualization. QADPZ has also pioneered autonomic computing approach for desktop grids and presents specific self-management features: self-knowledge, self-configuration, selfoptimization and self-healing. It is worth to mention that to the present the QADPZ has over a thousand users who have download it (since July, 2001 when it has been uploaded to sourceforge.net), and many of them use it for their daily tasks (see the appendix). Many of the results have been published or are in course of publishing as it can be seen from the references.
PublisherFakultet for informasjonsteknologi, matematikk og elektroteknikk
SeriesDoctoral Theses at NTNU, 1503-8181; 2008:153
Showing items related by title, author, creator and subject.
Security in Cloud Computing: A Security Assessment of Cloud Computing Providers for an Online Receipt Storage Blakstad, Kåre Marius; Andreassen, Mats (Master thesis, 2010)Considerations with regards to security issues and demands must be addressed before migrating an application into a cloud computing environment. Different vendors, Microsoft Azure, Amazon Web Services and Google AppEngine, ...
Computing Almost Split Sequences: An algorithm for computing almost split sequences of finitely generated modules over a finite dimensional algebra Lian, Tea Sormbroen (Master thesis, 2012)An artin algebra $l$ over a commutative, local, artinian ring $R$ was fixed, and with this foundation some topics from representation theory were discussed. A series of functors of module categories were defined, and almost ...
Government cloud computing: requirements, specification and design of a cloud-computing environment for police and law enforcement Tellefsen, Andreas (Master thesis, 2012)ENGELSK: As the amount of information found at digital forensic crime scenes increases each year, conventional methods for evidence handling and analysis come under pressure to perform at a sufficient level. New techniques ...