Optimal computer crash performance precaution

Document type: Journal Articles
Article type: Original article
Peer reviewed: Yes
Author(s): Efraim Laksman, Håkan Lennerstad, Lars Lundberg
Title: Optimal computer crash performance precaution
Journal: Discrete Mathematics and Theoretical Computer Science
Year: 2012
Volume: 14
Issue: 1
Pagination: 55-68
ISSN: 1462-7264
Publisher: Maison de l'informatique et des mathématiques discrètes
Organization: Blekinge Institute of Technology
Department: School of Computing; School of Engineering - Dept. of Mathematics & Natural Sciences (Sektionen för datavetenskap och kommunikation)
School of Computing S-371 79 Karlskrona
+46 455 38 50 00
Language: English
Abstract: For a parallel computer system with m identical computers, we study optimal performance precaution for one possible computer crash. We want to calculate the cost of crash precaution in the case of no crash. We thus define a tolerance level r, meaning that we only tolerate a completion time of a parallel program after a crash that is at most a factor r + 1 larger than with optimal allocation on m - 1 computers. This is an r-dependent restriction of the set of allocations of a program. Then, what is the worst-case ratio of the optimal r-dependent completion time in the case of no crash to the unrestricted optimal completion time of the same parallel program? We denote this maximal ratio of completion times by f(r, m), i.e., the ratio for worst-case programs. In the paper we establish upper and lower bounds on the worst-case cost function f(r, m) and characterize worst-case programs.
Subject: Computer Science\Computer systems
Keywords: Computer crash; Load balancing; Optimization; Parallel computer; Process allocation; Scheduling
Note: Open Access Journal