I'm planning on documenting a framework that we built for managing non-functional requirements. This is post #2 of the series.
In Post #1, Last In - First Out: Building a Non-Functional Requirements Framework - Overview I outlined the template and definitions for our Non-Functional Requirements.
We also had to address outstanding audit findings that pointed out the lack of enterprise-wide security standards. Blank templates weren't going to cut it. The next steps were to create a generic set of Non-Functional requirements within each category, applicable to any system that we'd likely encounter. We then followed up with a structured, objective framework for applying the requirements to a particular system. The next few posts will cover these topics.
To make the NFR's re-usable and applicable to as many systems as possible, we created multiple Metrics within each NFR. Systems for which requirements could be relatively simple would be required to meet a lower Metrics, while systems for which requirements needed to be higher/stricter would meet the higher Metrics in the NFR. The Metrics were designed so that the very lowest level would be applicable to a single personal computing device with no stored confidential data, the highest Metric would be applicable to our largest system with the most confidential or financial data, and the in-between Metrics would be applicable to systems of varying levels of security and availability requirements in between the extremes. This allowed us to create a single Requirement applicable to many (or any) systems proportional to their relative value, and without subjecting low value systems to rigorous requirements.
Note that Availability, Performance and Reliability requirements found in other models are not requirements categories in our model. We determined that if a system met a set of Resiliency, Recoverability and Security requirements, the system would also meet an appropriate level of availability and reliability as a byproduct of the Resiliency, Recoverability and Security Requirements. Likewise, the system would be able to meet Performance requirements as a byproduct of scalability and maintainability requirements.
Usability, Portability and Compatibility are common requirement families in other models, but as the model was driven by short-term infrastructure and security needs, they were left out in the early phases
Keep in mind that these categories and requirements were designed to be usable in our environment - a public College and University system.
The categories and a high level description of the requirements in each category follow:
Category: Resiliency
Resiliency requirements describe the ability of the system to continue to function during common failure modes. A resilient system continues to work after routine failures (disk, server, OS or process). Resiliency is necessary to meet availability requirements and usability requirements. A resilient system may use technologies such as redundancy, clustering, load balancing, error handling, and error recovery to function after component failure. Resiliency encompasses the concepts of availability, reliability, robustness, fault tolerance and exception handling as described by other authors.
Our model references three Resiliency requirements - Hardware Resiliency, Software Resiliency, and Environmental Resiliency. Each requirement may have multiple levels with each metric.
Resiliency-Hardware Requirement: The ability of the system to continue business functionality upon physical failure of hardware components that make up the system.
Incorporates traditional concepts of Redundancy, Clustering, Load Balancing and Fault Tolerance. A systems 'Availability', RPO and RTO are derived from this and other requirements.
This requirement is intended to force the designer to leverage high availability technologies for systems in which the impact of an unavailable system reaches certain thresholds.
Resiliency-Software Requirement: The ability of the system to continue business functionality upon logical failure of software components that make up the system.
Incorporates traditional concepts of Redundancy, Clustering, Load Balancing and Fault Tolerance. A systems 'Availability', RPO and RTO are derived from this and other requirements.
In general, the designer should consider Resiliency – Software, and Resiliency – Hardware NFR’s as a unit and engineer for both NFR’s in concert. In particular, the software must be designed so as to gracefully manage both software and hardware failures using robust transaction management and error handling. Failure modes and failure domains must be well understood.
Resiliency - Environmental Requirement: They ability of systems to continue business functionality upon physical failure of site environmentals, including power, cooling, and related components.
Incorporates redundant power, cooling, uninterruptable power, generator backup. A systems 'Availability', RPO and RTO are derived from this and other requirements.
This NFR specifies that the facilities-related components that support the system have the appropriate level of recoverability and resiliency.
Designers should engineer for routine power and cooling failures and have appropriate back up power, alternate cooling, as necessary. Facilities failure domains such as power supplies, power distribution units, air conditioning units, etc. should be considered.
Category: Recoverability
Recoverability requirements that describe the ability to recover from failed states and return the system to its as-built condition. Using the example of a failed unit of hardware, a resilient system will continue to function after failure, while a recoverable system will have a simple and predictable method for recovering from the hardware failure. Data backups, data replication, hot-swap hard drives, and automated operating system and application deployment tools may be technologies or techniques to recover a failed component.
Our model references four Recoverability requirements: Component Recovery, Site Recoverability, Configuration Recovery and Logical Recovery. Each requirement may have multiple levels with each metric.
Recoverability-Component Requirement: The ability to repair or replace system components predictably, with minimum work effort, and with no loss or disruption of business functionality.
Incorporates traditional concepts of Configuration Management and Maintainability. Assures that components can be brought on line without maintenance windows.
While the resiliency NFR’s cover the behavior of systems when components fail, the recoverability NFR’s assure that the design of systems includes the ability to restore the system to its original, pre-failure state in a predictable manner.
To assure component recoverability, the designer needs to assure that the configuration of all system components is known, and that a means exists to create new components that are identical to existing components.
Recoverability-Site Requirement: The ability of the system to resume business functionality upon physical or logical failure of the site housing components of the system.
Incorporates traditional concepts of Disaster Recovery, site failover, site replication, off-site backups. A systems 'Availability', RPO and RTO are derived from this and other requirements.
This NFR sets the minimum Recovery Point Objective (RPO) and Recovery Time Objective (RTO) that systems must meet under site related failures, such as data centers, buildings and campuses.
Recoverability - Configuration Requirement: The ability of the system to resume business functionality upon logical failure of system metadata or system configuration information.
Incorporates traditional concept of change management (portions of), configuration management, test and back-out plans for planned configuration changes.
The intent of this NFR is to provide assurance that the system is designed and managed such that if any portion of the configuration of the system is modified for any reason, intentionally or not, the system can be recovered back to the state that it was in pre-modification. This is intended to discourage systems in which the configuration is ad-hoc, unstructured, or 'mouse driven', as compared to template or script driven configurations.
Recoverability - Logical Requirement: The ability of the system to resume business functionality upon logical failure of application managed business data.
Incorporates traditional concepts of database 'point in time recovery', file system snapshots and daily backups. A systems RPO is derived from this and other requirements.
This NFR is intended to assure that the system is designed so that after the data in a system has been modified outside of normal business practices (I.E logical file system or database corruption, poor configuration management, unauthorized data modification by either internal or external entities) the data managed by the systems can be recovered to a state at a point in time prior to the modification.
Category: Scalability:
Scalability requirements describe the ability to add and remove capacity to the system without affecting the availability of the system, while maximizing maintainability and constraining costs.
Scalability - Component Requirement: The ability to dynamically and cost effectively add or remove capacity by adding or removing hardware or software components.
Incorporates the traditional concept of 'Horizontal Scalability', load balancing and dynamic capacity management. Assures that systems are compatible with cloud technologies.
The intent of this NFR is to force systems into a horizontally scalable architecture, and to limit or prohibit designs that depend on large-scale hardware upgrades to scale to additional capacity. I.E systems must be designed to scale out, not scale up.
Category: Maintainability:
Our model has a single Maintainability Requirement. The requirement may have multiple levels with each metric.
Maintainability requirements describe the ability to maintain the system over its operational life. Among other attributes, a maintainable system can have routine hardware upgrades and application deployments without user affecting outages, it will have monitoring, logging and auditing sufficient for routine troubleshooting, it will have a low operational cost. Maintainability encompasses manageability, upgradability, deployability and flexibility as described by other authors.
Maintainability-Component Requirement: The ability to maintain the hardware, software and environmental components of a system without disrupting business functionality, and with minimal or no planned system outages.
Incorporates traditional concepts of Service Management, Change Management (portions of), Maintenance Windows and Continuous Maintenance. Assures that effect of system maintenance on users will be minimized.
This requirement forces the designer to consider the maintainability of the system as a part of the design process. The designer should select and configure components such that:
- Routine maintenance can be conducted on-line, using common technologies such as load balancing and clustering or equivalent.
- Application patches and upgrades can be implemented on-line.
- The release of new application functionality, including database schema changes, can be done on-line in many or most cases.
Category: Security:
The ability to maintain the confidentiality and integrity of a system and the data contain in or controlled by the system. Requirements related to system access, system integrity, system confidentiality and system configuration.
Our model references five Security Requirements - Configuration Integrity, Configuration Assessment, Data Classification, Data Encryption, and Awareness and Training.
Security - Configuration Integrity Requirement: The ability to determine the source of modifications to the logical and physical configuration of a system. Logging and auditing of configuration information and changes. The ability to prevent or detect unauthorized changes to configuration or data. The ability to respond to unauthorized access or modification of system configuration or data. The ability to determine the configuration of a system at an arbitrary point in time in the past.
Incorporates the traditional concepts of Configuration Management, Change Management (portions of), security auditing, Business Activity Logging, Intrusion Detection/Prevention and Malware Detection/Prevention, and security incident handling.
The intent of this requirement is to ensure that the system is designed so that:
- The system can support/enable least privilege and role based system configuration.
- Configuration changes are detectable. This implies that technologies such as routine, scheduled, continuous, or near-continuous configuration auditing.
- Auditing of changes in configuration creates an immutable audit trail, and the audit trail is properly secured.
- The configuration of a system can be recovered back to the state that the system was in prior to the modification.
Security - Configuration Assessment Requirement: The assurance that the initial configuration of the system is appropriately secure, that the system configuration is maintained in an appropriately secure state over the life of the system and that the state is verified and tested.
Incorporates the traditional concepts of system hardening, code review, Vulnerability Management, Pen Tests, Patch Management and least privilege for access and modification of system configuration.
The intent of this requirement is to ensure that systems are initially configured to a secure state, and that they remain in that state over the life of the system.
- The initial condition of the system is ‘hardened’ consistent with this requirement.
- A process or method must be implemented to ensure that the system is maintained in that state over its lifetime.
- The condition of the system is verified periodically, depending on the Level within the requirement, for example by using vulnerability scans of systems and application code.
- The application code is written and tested in accordance with a formal software development practice.
- Technologies, tools frameworks and libraries are implemented in a consistently secure manner.
Security - Data Classification Requirement: The classification of data consistent with State and Federal regulations and the assignment of data ownership.
Security - Data Encryption Requirement: The conditions under which data must be transported, transmitted and stored in an unreadable, encrypted format.
Incorporates the traditional concepts of protecting data using encryption such that the data is only readable by authorized individuals.
The intent of this requirement is to ensure transport layer security is implemented for data that is transmitted over a less trusted network, and that encryption is implemented for data at rest. Encryption of data at rest may include full disk encryption, database encryption, and/or encryption of backup media.
Security - Data Access Requirement: The ability to limit logical and physical access to systems and data to authorized individuals, the ability to limit modification of systems and data to authorized individuals, the logging and auditing system and data access, and the ability to alert on unauthorized access.
Includes traditional concepts such as account provisioning and management, account credentials, authorization, least privileged based data access, business activity logging and audit logging, security perimeters and perimeter controls.
The intent of this requirement is to limit access to data based on need-to-know to perform job duties and to alert on inappropriate access, and/or have an audit trail of access or activities (i.e. read, write, modify, delete) that can be traced to an individual.
Security - Awareness and Training Requirement: The assurance that system administrators are adequately skilled and knowledgeable in information security and the implementation, management and maintenance of systems for which they are responsible.
The intent of this requirement is to ensure system administrative personnel have the skills, knowledge and/or experience to effectively implement requirements defined by Federal or State law, regulations, contractual agreements, Policies, Procedures or other non-functional requirements.