2011/09/16

QPRSUCOST: Software Design has more dimensions than 'Functionality'

Summary: There are multiple Essential Dimensions of Software Design besides "Functionality".

There are three Essential External Dimensions, {Function, Time, Money}, and multiple Internal Dimensions.
I'm not sure where/how "Real-Time" is covered; it isn't just "Performance". The necessary concurrency (not just "parallelism") and asynchronous events/processing require 10-100 times the cognitive capacity to deal with, and problems scale up extraordinarily (faster than exponentially) due to this added complexity. This is why Operating Systems and embedded critical systems (health/medicine, aerospace control, nuclear, Telecomms, Routers/Switches, Storage Devices, ...) are so difficult and expensive.
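To make the concurrency point concrete, here is a deliberately contrived Python sketch (my illustration, not part of the original argument) of a "lost update", a class of defect that simply cannot exist in single-threaded code:

  # Contrived sketch of a lost update: two threads each withdraw 10,
  # but because the read and the write are separate steps, one
  # withdrawal vanishes. The sleep just makes the interleaving reliable.
  import threading
  import time

  balance = 100

  def withdraw(amount):
      global balance
      current = balance            # read
      time.sleep(0.01)             # the other thread interleaves here
      balance = current - amount   # write back a now-stale value

  threads = [threading.Thread(target=withdraw, args=(10,)) for _ in range(2)]
  for t in threads:
      t.start()
  for t in threads:
      t.join()

  print(balance)                   # prints 90, not the expected 80

Reasoning about all possible interleavings, not just the code as written, is what multiplies the cognitive load.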

Not understanding and enumerating these multiple Dimensions, whilst seemingly teaching only Functionality, is perhaps the single biggest current failure of the discipline of Software Engineering.

The Necessary or Essential Dimensions of the later phases of Software Construction, Software Deployment and Software Maintenance, as well as of the meta-processes of Software Project Management and I.T. Operations, are beyond the scope of this piece.

This non-exhaustive taxonomy implies that there are additional Essential Dimensions, such as Maintainability and Manageability, elsewhere in the Computing/I.T. milieu.

My apologies in advance that this piece is in itself a first pass and not yet definitive.


+++ Need to deal with "Documentation" vs "Literate Programming" vs "Slices & tools"
+++ Dev - Ops. Infrastructure is part of the deliverable. Scripts on PRD/DEV/TST must be the same. Software Config Mgt and Migration/Fail-back/Fail-over are different and essential/necessary.



Software Design:
I'm using Software Design in an unconventional sense:
  everything that precedes and defines Coding and Construction.

Note that Software Design and Construction are closely intertwined and inter-dependent, and that all Software Projects are iterative, especially after notional Deployment and during Software Maintenance.

The acts of coding and testing uncover/reveal failings, errors, assumptions, blind-spots and omissions in the Design and its underlying models and concepts.

Where do the various Testing activities belong?
Wherever your Process or Project Methodology defines them to be.
Many Software Design problems are revealed when first attempting to construct tests, and more again when performing them, thus creating feedback, corrections and additional requirements/constraints.
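A small, invented illustration of this feedback loop: the function, its pricing rule and the tests below are hypothetical, but merely writing the test cases surfaces questions the Design never answered.

  # Hypothetical example: the act of writing tests exposes Design gaps.
  def shipping_cost(weight_kg: float) -> float:
      """Spec (invented): $5.00 flat up to 1 kg, then $2.00 per additional kg."""
      if weight_kg <= 1.0:
          return 5.00
      return 5.00 + 2.00 * (weight_kg - 1.0)

  def test_shipping_cost():
      assert shipping_cost(0.5) == 5.00
      assert shipping_cost(3.0) == 9.00
      # Attempting the next cases raises questions the Specification never answered:
      #   shipping_cost(0.0)   -- is a zero-weight parcel even valid?
      #   shipping_cost(-1.0)  -- reject silently, or raise an error?
      #   shipping_cost(2.5)   -- is the "additional kg" pro-rated or rounded up?
      # Each unanswered case is feedback into the Design, not just a test result.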


What's an "Essential Dimension"?
In Formal Logic and Maths, there's the notion of "necessary and sufficient conditions" for a relationship or dependency to hold.
It is in this sense that I'm defining the "Essential Dimensions" of elements or phases in the Software process: that they individually be Necessary and together be Sufficient for a complete solution/result.
A Dimension is Essential if its removal, omission or non-performance results in Defective, Incomplete, Ineffective, Non-Performing or Non-Compliant Software and Systems.
Or more positively, a Dimension is Essential if it must be performed to achieve the desired/specified process and product outputs and outcomes.
A marker of an Essential Dimension is that neglecting it produces its own recognisable class of Defect.

Defective, or colloquially "Buggy", Software has many aspects, not just "Erroneous, Invalid or Inconsistent Results".

The term is meant to be parsed against each of the Essential Design Dimensions for specific meanings, such as "Hacked or Compromised" (Security), "Failure to Proceed or Complete", i.e. crash or infinite loop (Quality), "Too Slow" (Performance), "Corrupt or Lose Data" (Quality), "Unmaintainable" (Quality) and "Maxed Out" (Scalability).


Initial candidate Essential Dimensions.
From my experience and observations of the full Software cycle and I.T. Operations, a first cut, not in order of importance:
  • F - Functionality
  • Q - Quality
  • P - Performance
  • R - Reliability/Recovery
  • S - Security/Safety
  • U - Usability 
  • C - Concurrency/Asynchrony
  • O - Operability/Manageability 
  • S - Scalability
  • T - Testability

Relative Importance of the Design Dimensions
Which Dimension is most important?
All and None: it depends on the specific project or task and its goals, constraints and requirements.

An essential outcome of the Specification phase of Software Design is to precisely define:
  • The criteria for each Essential Design Dimension for the Product, the Project, all Tasks and every Component.
  • The relative importance of the Dimensions.
  • How to assess final compliance to these criteria in both Business and Technical realms.
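One possible way to record that outcome, sketched here in Python with an invented component name and invented figures, is a per-component table giving one measurable criterion per Dimension:

  # Hypothetical sketch: per-component acceptance criteria, one measurable
  # entry per Essential Design Dimension. All names and figures are invented.
  ORDER_SERVICE_CRITERIA = {
      "Functionality":             "passes 100% of the written Functional Specification tests",
      "Quality":                   "zero known data-corrupting defects at release",
      "Performance":               "95th-percentile response under 200 ms at nominal load",
      "Reliability/Recovery":      "99.9% monthly availability; recovery within 15 minutes",
      "Security/Safety":           "passes the agreed penetration-test checklist",
      "Usability":                 "a new operator completes core tasks unaided within 30 minutes",
      "Concurrency/Asynchrony":    "no lost updates under the concurrent-load test harness",
      "Operability/Manageability": "start, stop, backup and monitoring runbooks exist and are tested",
      "Scalability":               "sustains 5x nominal load by adding servers, with no code change",
      "Testability":               "every requirement traceable to at least one automated test",
  }
  # The relative importance of each Dimension and the method of assessing
  # compliance (Business and Technical) would be recorded alongside.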
The one universally applicable Design Dimension is Quality.

Which of its many aspects are critical for any project, sub-system, task, phase or component, and how they will be monitored, controlled and confirmed, must be defined by your meta-processes or derived through the execution of your Methodology.

Minimally, any Professionally produced Software component or product must be shown to conform both to the Zeroth Law requirements (keep running, terminate, Do no Damage and produce results) and to its written Functional Requirements/Specifications.
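A minimal sketch, assuming a hypothetical command-line component, of checking three of the four Zeroth Law requirements mechanically; "Do no Damage" still needs a domain-specific check (e.g. that the input data is unchanged):

  # Minimal Zeroth Law check for a hypothetical command-line component:
  # it must terminate within a time bound, complete without crashing,
  # and produce some result. "Do no Damage" is checked separately.
  import subprocess

  def zeroth_law_check(cmd, timeout_s=60):
      try:
          result = subprocess.run(cmd, capture_output=True, timeout=timeout_s)
      except subprocess.TimeoutExpired:
          return False                     # failed to terminate in time
      if result.returncode != 0:
          return False                     # crashed or failed to complete
      return len(result.stdout) > 0        # produced some result

  # Usage (component name invented):
  # assert zeroth_law_check(["./report_generator", "--input", "sample.csv"])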


Quality


Zeroth Law requirements (keep running, terminate, Do no Damage and produce results)

From "The quality of software", Hoare, Software-Practice and Experience Vol 2, 1972 p103-5 

Hoare's Software Quality Criteria:
(1) Clear definition of purpose
(2) Simplicity of use
(3) Ruggedness
(4) Early availability
(5) Reliability
(6) Extensibility and improvability in light of experience
(7) Adaptability and easy extension to different configurations
(8) Suitability to each individual configuration of the range
(9) Brevity
(10) Efficiency (speed)
(11) Operating ease
(12) Adaptability to wide range of applications
(13) Coherence and consistency with other programs
(14) Minimum cost to develop
(15) Conformity to national and international standards
(16) Early and valid sales documentation
(17) Clear accurate and precise user’s documents

Security/Safety

Performance

Usability

Reliability/Recovery

Scalability

Testability
  • Functional Testing or Specification Compliance Testing?
  • Load Testing
  • Regression Testing, especially post-Release
  • Acceptance Testing. Commercial Compliance?
  • Others?
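An invented Python illustration of the difference between the first and third items: a Functional (Specification-Compliance) test written from the spec, and a Regression test added post-Release to pin down a reported defect. The function, its rule and the issue number are all hypothetical.

  # Hypothetical example distinguishing Functional from Regression tests.
  def normalise_phone(number):
      """Invented rule: keep digits only, then keep the last 9 digits."""
      digits = "".join(ch for ch in number if ch.isdigit())
      return digits[-9:] if len(digits) > 9 else digits

  def test_functional_spec():
      # Written from the Specification, before Release.
      assert normalise_phone("02 6123 4567") == "261234567"

  def test_regression_issue_1234():
      # Added after Release: hypothetical issue 1234 reported that numbers
      # with a "+61" prefix were mangled. This test must stay green forever.
      assert normalise_phone("+61 2 6123 4567") == "261234567"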
Concurrency/Asynchrony

Operability/Manageability

2011/09/14

A new inflection point? Definitive Commodity Server Organisation/Design Rules

Summary:

For the delivery of general-purpose and wide-scale Compute/Internet Services there now seems to be a definitive hardware organisation for servers, typified by the eBay "pod" contract.

For decades there have been well documented "Design Rules" for producing Silicon devices using specific technologies/fabrication techniques. This is an attempt to capture some rules for current server farms. [Update 06-Nov-11: "Design Rules" are important: Patterson, in a Sept. 1995 Scientific American article, notes that the adoption of a quantitative design approach in the 1980s lifted microprocessor speedup from 35% p.a. to 55% p.a. After a decade, processors were 3 times faster than forecast.]

Commodity Servers have exactly three possible CPU configurations, based on "scale-up" factors:
  • single CPU, with no coupling/coherency between App instances. e.g. pure static web-server.
  • dual CPU, with moderate coupling/coherency. e.g. web-servers with dynamic content from local databases. [LAMP-style].
  • multi-CPU, with high coupling/coherency. e.g. "Enterprise" databases with complex queries.
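The three-way rule above can be written down as a trivial decision function; the coupling categories come from the list, the encoding is my own sketch:

  # Sketch of the three-way Design Rule above; categories from the text,
  # the encoding itself is illustrative only.
  def server_class(coupling):
      """coupling: how tightly App instances must share state/coherency."""
      if coupling == "none":        # e.g. pure static web-server
          return "single CPU"
      if coupling == "moderate":    # e.g. LAMP-style dynamic content, local DB
          return "dual CPU"
      if coupling == "high":        # e.g. "Enterprise" DB with complex queries
          return "multi-CPU"
      raise ValueError("characterise the workload's coupling/coherency first")

  # server_class("moderate")  ->  "dual CPU"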
If you're not running your Applications and Databases in Virtual Machines, why not?
[Update 06-Nov-11: Because Oracle insists some feature sets must run on raw hardware. Sometimes vendors won't support your (preferred) VM solution.]

VM products are close to free and offer incontestable Admin and Management advantages, like 'teleportation' or live-migration of running instances and local storage.

There is a special non-VM case: cloned physical servers. This is how I'd run a mid-sized or large web-farm.
This requires careful design, a substantial toolset, competent Admins and a resilient Network design. Layer 4-7 switches are mandatory in this environment.

There are 3 system components of interest:
  • The base Platform: CPU, RAM, motherboard, interfaces, etc
  • Local high-speed persistent storage, i.e. SSDs in a RAID configuration.
  • Large-scale common storage: network-attached storage with filesystem, not block-level, access.
Note that complex, expensive SANs and their associated disk-arrays are no longer economic. Any speed advantage is dissolved by locally attached SSDs, leaving only complexity, resilience/recovery issues and price.
Consequently, "Fibre Channel over Ethernet", with its inherent contradictions and problems, is unnecessary.

Designing individual service configurations can be broken down into steps:
  • select the appropriate CPU config per service component
  • specify the size/performance of local SSD per CPU-type.
  • architect the supporting network(s)
  • specify common network storage elements and rate of storage consumption/growth.
Capacity Planning and Performance Analysis are mandatory in this world.
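To illustrate the kind of arithmetic involved, a back-of-envelope sketch in which every figure is invented purely for illustration:

  # Back-of-envelope capacity planning sketch; all figures are invented.
  import math

  peak_requests_per_sec   = 12_000   # measured or forecast peak load
  requests_per_server_sec = 800      # from load-testing a single server
  headroom                = 0.70     # run servers at ~70% of tested capacity
  growth_per_year         = 1.4      # assume 40% annual traffic growth

  servers_now  = math.ceil(peak_requests_per_sec
                           / (requests_per_server_sec * headroom))
  servers_next = math.ceil(peak_requests_per_sec * growth_per_year
                           / (requests_per_server_sec * headroom))

  daily_new_data_gb = 150            # rate of common-storage consumption
  storage_year_tb   = daily_new_data_gb * 365 / 1024

  print(servers_now, servers_next, round(storage_year_tb, 1))
  # -> 22 servers today, 30 within the year, ~53.5 TB of new common storage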

As a professional, you're looking to provide "bang-for-buck" for someone else who's writing the cheques. Over-dimensioning is as much a 'sin' as running out of capacity. Nobody ever got fired for spending just enough, and thereby maximising profits.

Getting it right as often as possible is the central professional engineering problem.
Followed by limiting the impact of Faults, Failures and Errors - including under-capacity.

The quintessential advantage to professionals in developing standard, reproducible designs is the flexibility to respond to unanticipated loads/demands and the speed with which new equipment can be brought on-line and, conversely, retired and removed.

Security architectures and choice of O/S + Cloud management software is outside the scope of this piece.

There are many multi-processing architectures, each best suited to particular workloads.
They are outside the scope of this piece, but locally attached GPUs are about to become standard options.
Most servers will acquire what were once known as vector processors, and applications using this capacity will start to become common. This trend may need its own Design Rule(s).

Different, though potentially similar, design rules apply for small to mid-size Beowulf clusters, depending on their workload and cost constraints.
Large-scale or high-performance compute clusters or storage farms, such as the IBM 120 Petabyte system, need careful design by experienced specialists. With any technology, "pushing the envelope" requires special attention from the best people you have to even have a chance of success.

Unsurprisingly, this organisation looks a lot like the current fad, "Cloud Computing", and the last fad, "Service-Oriented Architecture".



Google and Amazon dominated their industry segments partly because they figured out the technical side of their business early on. They understood how to design and deploy datacentres suitable for their workload, how to manage Performance and balance Capacity and Cost.

Their "workloads", and hence server designs, are very different:
  • Google serves pure web-pages, with almost no coupling/communication between servers.
  • Amazon has front-end web-servers backed by complex database systems.
Dell is now selling a range of "Cloud Servers" purportedly based on the systems they supply to large Internet companies.





An App too far? Can Windows 8 gain enough traction?

Summary:
"last to market" worked as a strategy in the past for Microsoft.
But "everything is a PC" is probably false and theyll be sidelined in the new Mobile Devices world.