Robustness

System robustness is becoming increasingly important simply because our lives are becoming increasingly dependent on the systems we use, and our expectations of how those systems function are rising. In its simplest form, robustness means the ability to operate unaffected, or with slightly lower performance, even if the environment is not behaving according to specification or expected operational limitations. Robust design is usually quite complex and involves trade-offs.

Designing robust systems is not trivial, but depending on the consequences of failure it can be well worth the investment. Imagine a car with a fragile braking system. This would of course not be acceptable, so cars have robust braking systems in which a partial failure leads to degraded performance rather than complete failure.

There are many strategies for achieving robustness:
  • A temporary fix that allows degraded performance until the issue can be addressed. E.g. a lightweight spare tyre with strict speed limitations.
  • Slowly degrading systems that allow issues to be detected before failure. E.g. as applied in many mechanical systems, such as brake disks.
  • System duplication or redundancy. E.g. a hot standby server that can accept the workload if the primary server fails.
  • Alternate algorithms used when the normal algorithms do not perform as expected. E.g. a simple and slow sorting algorithm that uses less memory, for data sets that are too big for optimal processing in memory.
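The last strategy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the `max_fast_items` threshold are hypothetical, and a real system would choose its limit from measured memory use rather than a fixed item count. The normal path builds a new sorted list and so needs extra memory, while the fallback sorts in place, slowly, with constant extra memory.

```python
from typing import List


def fast_sort(items: List[int]) -> List[int]:
    """Normal path: fast, but allocates a second list of the same size."""
    return sorted(items)


def in_place_insertion_sort(items: List[int]) -> List[int]:
    """Fallback path: O(n^2) time, but only constant extra memory."""
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # Shift larger elements one slot to the right.
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current
    return items


def robust_sort(items: List[int], max_fast_items: int = 1_000_000) -> List[int]:
    """Use the fast algorithm when it is expected to fit, else degrade.

    max_fast_items is a hypothetical threshold standing in for a real
    memory budget.
    """
    if len(items) <= max_fast_items:
        try:
            return fast_sort(items)
        except MemoryError:
            # The estimate was wrong; fall through to the slower path.
            pass
    return in_place_insertion_sort(items)
```

Callers only ever see `robust_sort`. When the data set grows past the limit, the result is still correct and only the running time degrades, which is the behaviour the strategy aims for.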

Robust system behaviour is not a plan B or a change in implementation that is executed after a failure. Focusing on building a robust system is a recognition that not all variations in usage can be known up front, and an attempt to handle some of these variations anyway. This is part of designing for a great user experience. In its most basic form, the aim of robustness is to ensure system availability.
