Exploring the Scalability of Distributed Computing Networks
Item
Poster Number
21
Poster Title
Exploring the Scalability of Distributed Computing Networks
First Presenter
Jacob Bushur
Abstract
Biology, chemistry, computer science, engineering, meteorology, and other fields of study share the need to process large amounts of data. From protein folding simulations to subatomic chemical reaction simulations to the analysis of large data sets, many fields of research require an increasing amount of computational power. There are currently two ways to meet this rising demand: supercomputers and distributed computing networks. Distributed computing networks offer a potentially cheaper alternative to supercomputers without sacrificing computing power. Distributed computing is based on the premise that many smaller computers can be networked together to form a virtual supercomputer. To run a program on a distributed computing network, the network manager must first split the program into a collection of jobs. Each job is then sent to a computer connected to the network, and the results are sent back to the central manager and presented to the user. These networks are typically created using specially developed software such as the Berkeley Open Infrastructure for Network Computing (BOINC), Folding@home, or HTCondor, and their architecture makes them much cheaper to maintain. Because BOINC, Folding@home, and HTCondor simply connect and manage computers over a network, operators almost entirely circumvent the hardware, electricity, space, and cooling costs typically associated with supercomputers. Additionally, computers on the network can perform valuable work when they would otherwise sit idle. Thus, distributed computing networks harness existing computing resources and grant access to significant computing power while averting the costs of a supercomputer. Both supercomputers and distributed computing networks offer substantial computing power to researchers in various fields.
However, for many organizations and researchers, the costs of building and maintaining or renting a supercomputer make distributed computing networks a more attractive option. Given these benefits, this project has two main goals. The first goal is to explore the limits of the distributed computing network architecture on computationally intensive problems. Specifically, the research aims to quantify the marginal benefit of breaking a problem into smaller pieces and to identify whether there is a point at which using more computers fails to improve execution time. To accomplish this goal, two distributed computing networks are created. The first network consists of 16 Raspberry Pis. The second network consists of computers located in the Purdue University Fort Wayne Engineering, Technology, and Computer Science (ETCS) building. The second goal is to make the Purdue Fort Wayne distributed computing network available to faculty and students, giving future research projects access to significant computational power. Additionally, this network will be configured with BOINC so that, when it would otherwise be idle, it can contribute to the World Community Grid, a distributed computing project dedicated to medical and environmental research.
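The first goal, quantifying the marginal benefit of adding computers and finding where it vanishes, can be illustrated with a toy cost model. The sketch below is not the poster's method or data: the function names, the form of the model (a fixed serial part, evenly divisible work, and a coordination cost that grows with the number of workers), and every parameter value are assumptions chosen only to show the shape of the question.

```python
def modeled_time(workers, serial=2.0, parallel=100.0, overhead_per_worker=1.0):
    """Hypothetical execution time (arbitrary units) for a given worker count.

    Assumed model, not measured data: a serial part that cannot be split,
    a parallel part divided evenly among workers, and a coordination cost
    (job distribution, result collection) that grows with each worker.
    """
    return serial + parallel / workers + overhead_per_worker * workers


def marginal_benefits(max_workers, **params):
    """Time saved by going from n-1 to n workers, for n = 2..max_workers."""
    return {
        n: modeled_time(n - 1, **params) - modeled_time(n, **params)
        for n in range(2, max_workers + 1)
    }


def saturation_point(max_workers, **params):
    """Smallest worker count at which one more worker no longer helps.

    Returns None if every additional worker up to max_workers still
    reduces the modeled time.
    """
    for n in range(1, max_workers):
        if modeled_time(n + 1, **params) >= modeled_time(n, **params):
            return n
    return None


if __name__ == "__main__":
    for n, saved in marginal_benefits(12).items():
        print(f"{n:2d} workers: saves {saved:7.2f} over {n - 1} workers")
    print("saturation point:", saturation_point(32))
```

Under these made-up parameters the benefit of each added worker shrinks steadily and eventually turns negative. In an actual experiment the modeled times would be replaced by measured execution times from each network.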
Year
2022
Embargo
no embargo