Measuring Software Reliability of Interactive Systems
By Youssef Edward
August 5, 2019
Every interactive system, including soft and hard real-time systems, defines a certain set of inputs and outputs. For each combination of inputs, there is an expected output. A failure for some set of inputs results in unexpected system behavior, which in a hard real-time system may lead to a dangerous action.
A software failure can be the result of a software fault, but the two are not the same. A software fault is an error in the code or design, meaning that the program does not comply with its documentation. It can be discovered by a program inspector. A software failure, on the other hand, is a transient error during execution, after which the software may continue running.
Software reliability relates to the number of software failures seen by one user, in a specific environment and for a specific purpose. For example, suppose a piece of aircraft software is said to be 99.99% reliable over a flight that takes 5 hours. This means that a failure is expected in 1 of every 10,000 flights. For an aircraft, this is not acceptable.
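The arithmetic behind this example can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes that the reliability figure is the probability of completing one flight without a failure and that flights are independent of each other.

```python
def failure_probability(reliability: float) -> float:
    """Probability that a single mission (here, one flight) fails."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return 1.0 - reliability


def expected_failures(reliability: float, missions: int) -> float:
    """Expected number of failed missions, assuming independent missions."""
    return failure_probability(reliability) * missions


# 99.99% reliability per 5-hour flight, evaluated over 10,000 flights
print(round(expected_failures(0.9999, 10_000), 6))
```

Under these assumptions the result is about one failed flight per 10,000, which is the figure quoted above.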
In an embedded system, measuring the reliability of the whole system by combining hardware and software reliability is more difficult. This is because hardware faults, such as a blown capacitor or an open transistor, are unrecoverable: the system must be repaired before it can work again. Software faults, on the other hand, show up in a transient manner. After one occurs, the system may continue to run, but possibly in an unexpected way, for example with some options disabled while others work well.
Analyzing a system for reliability requires supplying all possible sets of inputs and checking the output for each set. A failure for any set affects the reliability of the software system. The sets of inputs can be divided into:
1. Normal operation set
2. Exceptional operation set
The first set defines the usual mode of operation that the user follows. For instance, the user must enter the data to set up the device, press start or stop at the right time, and so on.
The second set defines unusual conditions that the user may cause, but only very rarely. For instance, the user may press stop on a processor-based industrial power supply while the power is still ramping up. During this interval, the power supply normally increases the power until it reaches a steady level and only then polls to check whether stop has been pressed. Pressing stop during this soft start can be treated as an abnormal condition.
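The split into a normal and an exceptional input set can be illustrated with a small test harness. This is only a sketch: `toy_supply`, `InputCase`, `failure_rate`, and the 5-second ramp time are all hypothetical, and the toy model simply assumes that a momentary stop press during the ramp is missed because the button is polled only after the ramp ends.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

RAMP_END_S = 5.0  # assumed: the soft start finishes 5 s after "start"


@dataclass(frozen=True)
class InputCase:
    name: str
    stop_pressed_at_s: float  # time of a momentary stop press
    expected: str             # expected final state of the supply
    exceptional: bool = False


def toy_supply(stop_pressed_at_s: float) -> str:
    """Toy model: stop is polled only after the ramp ends,
    so a press during the ramp is missed."""
    return "stopped" if stop_pressed_at_s >= RAMP_END_S else "running"


def failure_rate(system: Callable[[float], str],
                 cases: Iterable[InputCase]) -> float:
    """Fraction of cases whose actual output differs from the expected one."""
    cases = list(cases)
    if not cases:
        raise ValueError("at least one input case is required")
    failures = sum(1 for c in cases
                   if system(c.stop_pressed_at_s) != c.expected)
    return failures / len(cases)


CASES = [
    InputCase("stop at steady level", 6.0, "stopped"),
    InputCase("stop after long run", 60.0, "stopped"),
    InputCase("stop right after ramp", 5.0, "stopped"),
    InputCase("stop during ramp", 2.0, "stopped", exceptional=True),
]
NORMAL = [c for c in CASES if not c.exceptional]
EXCEPTIONAL = [c for c in CASES if c.exceptional]

print("normal set failure rate:", failure_rate(toy_supply, NORMAL))
print("exceptional set failure rate:", failure_rate(toy_supply, EXCEPTIONAL))
```

Keeping the two sets separate makes it possible to report a failure rate for normal operation on its own, and to see how much the exceptional cases would add.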
Including abnormal conditions when estimating software failures is not a best practice, and it makes little difference compared with including only the normal operations. A study revealed that the estimate may change by only 3% when the abnormal conditions are included. This is because experienced users tend to avoid the abnormal conditions, so the software quality will be good if the normal conditions are handled well.
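One way to see why rarely used abnormal conditions shift the estimate only slightly is to weight each input set by how often it occurs in practice. The sketch below assumes such a usage-weighted estimate. The function name and all numbers are illustrative assumptions and are not taken from the study mentioned above.

```python
def combined_failure_probability(p_fail_normal: float,
                                 p_fail_exceptional: float,
                                 exceptional_share: float) -> float:
    """Failure probability per operation when a fraction
    `exceptional_share` of operations are abnormal."""
    if not 0.0 <= exceptional_share <= 1.0:
        raise ValueError("exceptional_share must be between 0 and 1")
    return ((1.0 - exceptional_share) * p_fail_normal
            + exceptional_share * p_fail_exceptional)


# Illustrative values only (assumptions, not measured data):
P_FAIL_NORMAL = 0.001      # failure probability for normal inputs
P_FAIL_EXCEPTIONAL = 0.5   # failure probability for abnormal inputs
EXCEPTIONAL_SHARE = 0.01   # experienced users rarely trigger abnormal inputs

reliability_normal_only = 1.0 - P_FAIL_NORMAL
reliability_with_abnormal = 1.0 - combined_failure_probability(
    P_FAIL_NORMAL, P_FAIL_EXCEPTIONAL, EXCEPTIONAL_SHARE)

print("normal only:", reliability_normal_only)
print("with abnormal conditions:", reliability_with_abnormal)
```

With these made-up numbers, abnormal inputs fail far more often than normal ones, yet the overall estimate moves by well under one percentage point because they make up so small a share of actual use.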