What Happened to P2P Systems? A Special Focus on Content Distribution Systems
Peer-to-Peer (P2P) technology has emerged as a new distributed computing paradigm. It attempts to "harness the powers of the edges of the Internet" by making efficient and effective use of the peers (users) at the "edge" of the Internet through direct interaction between them. P2P architecture has attracted considerable interest and research in recent years because of the popularity of the file-sharing applications based on it. Content distribution is an essential P2P application on the Internet and has received significant research attention. Content distribution applications usually permit peer computers to work in a coordinated manner as a distributed storage system by sharing, searching, and retrieving digital content. This work explores existing P2P systems and infrastructure technologies in terms of their distributed object location and routing protocols; their solutions to data replication, migration, and caching; their support for encryption, anonymity, identity and authentication, deniability, access control, accountability, and reputation; and their use of resource trading and management schemes. A review of the P2P concept is given, together with various definitions of P2P computing drawn up by independent research institutes, in an attempt to delimit the field covered by P2P technology, since several distributed computing concepts have lately been tagged as P2P. A brief historical summary of P2P technology is also provided. A classification of P2P systems was necessary. There are two core types of P2P network architecture. In the first type, pure, there is no central control or server managing the network (e.g. no central router), and the peers act as equals, fusing the roles of client and server. The second type, hybrid, has one or several central entities, which gives this type of P2P network a single point of failure and bottleneck problems.
Distributed Hash Tables (DHTs) have proved to be a valuable building block for constructing large, scalable P2P systems, in the form of scalable overlay networks called DHT overlay networks. DHT overlay networks, each with its own resource location and routing mechanism, aim to reduce the routing cost, i.e. the number of hops needed to get a message from any source to any destination in the overlay, and to decrease the degree, i.e. the number of neighbors that a node has to know about. In this report, the most prominent DHT overlay network protocols were investigated: Chord, Freenet, Tapestry, and Content Addressable Networks (CAN). All of them have logarithmic bounds on hop count (routing cost) and degree, or argue that logarithmic bounds can be achieved with high probability. Based on developments within P2P so far, five general application areas are identified and presented: collaboration, distributed computing, Internet service support, distributed databases, and content distribution. Additional potential P2P applications may include distributed file systems, multicasting applications, search engines, agent-based systems, awareness systems, and mirror systems (clusters). The focus of this work was on P2P content distribution systems, ranging from simple direct file-sharing applications to more sophisticated systems that build a distributed storage medium for efficiently and securely publishing, managing, searching, indexing, manipulating, and accessing content. The main non-functional features of P2P content distribution systems and infrastructures were identified. They consist of requirements for security, anonymity, fairness, improved scalability, and better performance, in addition to resource management and organization capabilities.
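The two DHT metrics named above, routing cost (hops) and degree (neighbors per node), can be made concrete with a small simulation. The following is a minimal sketch of a Chord-style identifier ring with finger tables, not the implementation studied in this report. The 16-bit identifier space, the static ring with global knowledge at build time, and the names `successor`, `build_fingers`, `in_interval`, and `lookup` are all assumptions chosen for illustration; node joins, failures, and stabilization are omitted.

```python
import bisect
import random

M = 16            # identifier bits (assumed for this sketch)
RING = 2 ** M     # size of the circular identifier space


def successor(sorted_ids, ident):
    """Return the first node id at or clockwise after ident."""
    i = bisect.bisect_left(sorted_ids, ident % RING)
    return sorted_ids[i % len(sorted_ids)]


def build_fingers(sorted_ids):
    """Finger i of node n points to successor(n + 2**i); finger 0 is n's successor."""
    return {n: [successor(sorted_ids, n + 2 ** i) for i in range(M)]
            for n in sorted_ids}


def in_interval(x, a, b):
    """True if x lies in the ring interval (a, b]; a == b means the whole ring."""
    if a < b:
        return a < x <= b
    return x > a or x <= b


def lookup(fingers, start, key):
    """Route from node `start` to the node responsible for `key`.

    Returns (owner, hops), where hops counts forwarded messages.
    """
    node, hops = start, 0
    while True:
        succ = fingers[node][0]
        if in_interval(key, node, succ):
            return succ, hops + 1          # final hop to the responsible node
        # Forward to the farthest finger that still precedes the key.
        nxt = succ
        for f in reversed(fingers[node]):
            if f != key and in_interval(f, node, key):
                nxt = f
                break
        node, hops = nxt, hops + 1


if __name__ == "__main__":
    random.seed(1)
    ids = sorted(random.sample(range(RING), 256))
    fingers = build_fingers(ids)
    hops = [lookup(fingers, random.choice(ids), random.randrange(RING))[1]
            for _ in range(1000)]
    degree = [len(set(f) - {n}) for n, f in fingers.items()]
    print("nodes:", len(ids))
    print("mean hops:", sum(hops) / len(hops), "max hops:", max(hops))
    print("mean degree:", sum(degree) / len(degree), "max degree:", max(degree))
```

In this sketch each forwarding step at least halves the remaining clockwise distance to the key's predecessor, so a lookup needs at most M + 1 hops, and each node stores at most M distinct fingers. This is the kind of trade-off between hop count and degree that the surveyed protocols make in different ways.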
Two general categories of systems have been recognized: unstructured systems and structured (DHT-based) systems. Recent research advances in both structured and unstructured P2P systems treat them as complementary and equally suitable solutions. Studies of other characteristics of P2P content distribution systems have benefited from expertise and solutions from various scientific areas, such as research on the application of reputation systems in P2P settings, the semantic management of information, and the use of cryptographic mechanisms to protect data, both stored and in transit through the network. As P2P technologies are still growing and evolving, there are a large number of open research areas, directions, and opportunities, including: (1) the design and development of novel distributed object location and routing mechanisms and DHT data structures; (2) the investigation of more efficient security, anonymity, and censorship resistance solutions; (3) the semantic organization of information in P2P networks; (4) the design of incentive mechanisms and reputation approaches that motivate cooperative behavior among users; and (5) the convergence of P2P and Grid systems, combining the advantages of the established field of distributed technology with the qualities of new P2P architectures. Future work on P2P may also include the development of a common API for DHT overlays, the development of application-level P2P benchmarks, degree and hop count tradeoffs, locality-aware DHT overlay networks, load balancing across all nodes in the network, full-text keyword search on DHT overlays, data synchronization, and ways to provide ACID properties for transactions. The use of DHTs to create scalable, fault-tolerant location and routing overlay networks and content distribution systems seems to be a step in the right direction.
However, the P2P community still has a long way to go before P2P becomes a scalable, fault-tolerant, and secure infrastructure.