ISBN-13: 9783639163612 / English / Paperback / 2009 / 148 pp.
Sources of errors can be categorized into four types: (1) software bugs; (2) human mistakes during the configuration and deployment of applications and during machine maintenance; (3) system hardware failures; and (4) network problems.

Software bugs, such as memory leaks, directly affect the availability of system resources. Human mistakes usually reduce application availability. When thousands of computers are connected to form an application and serve network traffic, hardware failures such as RAID failures, file system issues, and failed disks become common. An example of a network problem is a failed network switch. Worse, a switch usually fails only partially, and before the faulty switch is identified, the resulting application timeouts and intermittent availability make troubleshooting extremely frustrating. Embedded network issues are sometimes hard to identify.

Error event filtering, in both the temporal and spatial domains, can identify problems most of the time and helps in troubleshooting. Once the errors are identified, modeling and failure prediction come into play.
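
To make the idea of temporal and spatial filtering concrete, here is a minimal Python sketch. It is not taken from the book: the `ErrorEvent` record, the function names, and the fixed-window rules are all assumptions chosen for illustration. Temporal filtering here collapses a burst of repeated errors from one host into a single event; spatial filtering groups errors of the same kind reported by different hosts close together in time, which can hint at a shared cause such as a partially failed switch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorEvent:
    # Hypothetical event record; field names are assumptions for this sketch.
    timestamp: float  # seconds since some reference point
    host: str
    kind: str


def temporal_filter(events, window):
    """Suppress an event if the same (host, kind) pair was last seen
    no more than `window` seconds earlier."""
    last_seen = {}
    kept = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        key = (ev.host, ev.kind)
        prev = last_seen.get(key)
        if prev is None or ev.timestamp - prev > window:
            kept.append(ev)
        # Track every occurrence so a continuing burst stays suppressed.
        last_seen[key] = ev.timestamp
    return kept


def spatial_filter(events, window):
    """Group events of the same kind across hosts when they occur within
    `window` seconds of the first event in the group."""
    groups = []
    open_groups = {}  # kind -> most recent group for that kind
    for ev in sorted(events, key=lambda e: e.timestamp):
        group = open_groups.get(ev.kind)
        if group is None or ev.timestamp - group["start"] > window:
            group = {"kind": ev.kind, "start": ev.timestamp, "hosts": set()}
            open_groups[ev.kind] = group
            groups.append(group)
        group["hosts"].add(ev.host)
    return groups
```

Applied in sequence, `spatial_filter(temporal_filter(events, 10), 10)` would reduce a stream of repeated timeouts from several hosts to one group per error kind, listing the affected hosts. The window length is a tuning parameter, and a real system would likely need more careful grouping rules than this fixed window.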