2007/12/29

IBM, Outsourcing and the IT Profession

This is a reaction to Robert X. Cringely's "Pulpit" of 28-Dec-2007:

Leaner and Meaner Still: IBM's U.S. operations continue to shrivel.

There are 3 parts to my comments:
  1. Will IBM Survive?
  2. Outsourcing
  3. IT as a Profession

They are interlinked. Lou Gerstner set IBM on the road to "Services" and away from Mainframes. It looked promising.
IT Services look very appealing on the Balance Sheet - nearly no investment (no tangible assets) and what seem to be good profits from turnover. The ROA and ROI (Return on Assets and Return on Investment) look great - until you take some other factors into account.

  • Barriers to Entry for competitors are low.
    EDS under Ross Perot came from nowhere to define and dominate the field - so can the next giant in the field.
    If your business model is "hire cattle and drive them till they drop" - you have no market differentiation.
    Same cattle, same drivers, same pay - same 'ol, same 'ol... The cattle aren't loyal, motivated or engaged.
    Writing new contracts is a matter of perception, influence and contacts.
    There is so much feeling against IT Outsourcers in business at the moment, the first company to come along and tell a better story will take the field.
    The change won't be overnight, but fast enough that the incumbents won't notice until too late.

  • Whilst only tangible assets appear on the Balance Sheet, IT Services are driven by your Human Capital and some Intellectual Capital embodied in your processes, branding and IP, such as trademarks and patents.
    What value is left in the offices when everybody has gone home? Very, very little.
    What is the business risk of a large, sudden exodus of your staff? A competitor may deliberately poach enough to put you in trouble.
    It's a failing of the Board not to understand this and institute appropriate metrics, accounting and management rewards.

  • Profits based on Operations turnover are very fragile/volatile.
    Income and Expenses are very large numbers with a small difference. Expenses are mainly employees - which you may not be able to shed as quickly as service contracts expire.
    Tendering for new contracts implies you have, or can quickly get, the resources to fulfill the contract. That's an extreme business risk.
    The key figures-of-merit are Income/Employee and Profit/Employee.
    We don't see those reported or obviously managed.

  • IT Services work is Knowledge Work - it is mostly invisible and intangible.
    Driving IT staff like unskilled labourers with threats/punishment to lift performance is anti-productive.
    Unhappy staff withdraw and push back. At best they aren't engaged or motivated. They 'do the minimum' - a grudging compliance.
    They stop caring about their work, the customer and their employer. And if you are lucky, it stops there.
    Hiring bright, capable people doing intangible work and treating them badly is not just a recipe for disaster, it is foolishness writ large.

  • IT is a cognitive amplifier and this can be leveraged both within the business and internally in IT.
    The only sustainable strategy to deliver improved profits is through investment.
    • Automating tasks.
      Applying our own technology to our jobs to make tasks, not jobs, redundant.
      Investing in tools and hardware to increase both the Quality and productivity of work.
    • Building Human Capital.
      Investing in the people at the work-face to build their capability and performance.
      Barry Boehm created COCOMO - a quantitative model for estimating Software costs (a rough worked sketch follows after this list).
      Experienced, competent practitioners not only produce better work, fewer defects, faster - they are cheaper.
    • Actively reducing Errors.
      Consciously reducing waste, rework and wrong work.
      Quality is not about 'doing the minimum', it's a mindset where Errors are allowed, but their repetition is anathema.
      High Quality performances are only achieved with deliberate, focussed intention. Not blaming and denial.
      A Quality System's only goal is to make it difficult for good people to make mistakes.
      Deming said it all with "Plan-Do-Check-Act", or in new-speak: "Preparation - Execution - Review & Evaluation - Improvement"
    • Learning is central to improving Quality, Performance, Security & Safety and Usability.
      Learning systems, processes and procedures takes an investment of time, tools and technology.
      Failing to build teams and their capability will decrease expenses in the short-run but increase them in the long-run.

  • Resiling from the classic adversarial stance of IT Outsourcing.
    IT is a business enabler. It is now central to normal business operations. It is still where 80% of efficiency improvements arise.
    Every act that hurts the client will turn-around and hurt the provider, but more.
    The client is earning the income that pays for the IT.
    More income, more IT, more outsourcing revenue and profits. A simple equation that seems lost on Outsourcing managers.
    Outsourcing contracts need to align the internal management rewards with improving business outcomes for the Client.
    What's anathema to current management - reducing Client costs - must be aggressively pursued to create a long-term Outsourcing business.
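
As a rough illustration of the kind of quantitative estimating COCOMO enables, here is a minimal sketch of Basic COCOMO's 'organic mode' formula in Python. The 8 KLOC project size is an invented example; Intermediate COCOMO adds effort multipliers for analyst/programmer capability and experience, which is where "experienced practitioners are cheaper" shows up numerically.

    # Basic COCOMO (Boehm, 1981), organic-mode coefficients.
    # Effort (person-months) = a * KLOC^b ; Schedule (months) = c * Effort^d
    def basic_cocomo(kloc, a=2.4, b=1.05, c=2.5, d=0.38):
        effort_pm = a * kloc ** b           # person-months
        schedule_m = c * effort_pm ** d     # elapsed calendar months
        staff = effort_pm / schedule_m      # average headcount
        return effort_pm, schedule_m, staff

    effort, months, staff = basic_cocomo(8.0)   # an illustrative 8 KLOC project
    print(f"{effort:.1f} person-months over {months:.1f} months (~{staff:.1f} people)")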

Quality is not 'gold-plating' - it is central to improving productivity, reducing waste and fulfilling customer expectations. These are the drivers for growth, profitability and sustainability - not penny-pinching and cost-cutting.

IT Services companies cannot, and will not, pursue Excellence & Quality if they are not driven to it.
It is only their Clients who can hold them accountable and force a change.

Concurrently, IT has to evolve from an Industry to a Profession so that managers can realistically evaluate the performances of different practitioners. It's not hard to win new business and make good profits if your employees are 10 times more productive than your competitors'.


Will IBM Survive?


Answering the poll question: Will IBM survive?

Lou Gerstner turned IBM around, starting in 1993.
It took an outsider to do it - and the board knew that.

His legacy, after leaving in 2002, should've been a company with a solid future. Five years on, it appears not so - that can only be "Corporate Culture".

IBM is far too important to be allowed to fail and be broken up in a firesale.

But we have a perfect model for the future of Cringely's "lumbering giant": Unisys.

In 1986, Numbers 2&3 in the market (Burroughs & Sperry Univac) combined and produced a dud. It's still alive, but failing. Because enough people use their mainframes (2200's and A-series), they can't be allowed to die. Slowly withering on the vine seems to be fine.

Fujitsu is the perfect vacuum-cleaner to buy the hardware business in the final break-up.


Outsourcing


IBM GSA and the other 'Tier 1' outsourcers operate from the same playbook - a version of 'bait and switch'. Also known as "The Value Prevention Society".

I've worked with and for all the major outsourcers in Australia. They all bid low to win contracts and adopt a dual strategy of "controlling costs" and price gouging for "variations".

'Controlling costs' is reducing staff, replacing competent staff with 'cheap and cheerful' newbies, not performing maintenance and avoiding capital investment.

What's wrong with a 5-10 year-old system? Nothing if you don't have to suffer the performance and other problems!

They routinely ignore contract provisions - like scheduled roll-outs of new desktops, upgrades and system performance targets.

The problems are at least three-fold:
- inequality of parties (Outsourcer vs Client)
- internal 'manager' performance has no upside, only downside
- no impartial umpire and effective 'stick' to enforce system performance targets

Inequality
Every company that signs an IT Outsourcing agreement signs just one. The outsourcer has done this many, many times before.
Clients also don't factor in the increased staff and reporting costs - each side needs additional staff for 'contract management'.

The Client thinks it has stitched up an iron-clad contract and they forecast a bountiful harvest... Which doesn't happen.

Service degrades, minor works become hugely expensive, major works take forever and often don't get implemented.
The business people give-up and adapt around it.

In Australia, all the major EDS contracts let around 10 years ago are now being re-tendered - with EDS getting very little of the new work.
Are they the worst? Hard to say...

Aligning internal rewards with Client Needs
Outsourcer 'managers' can only be assessed on monetary performance. With fixed price contracts, base income is fixed.

If a manager reduces costs 5% one year, this becomes the expectation for every following year - it is not seen as a 'one-off'. Without significant staff training and capital expenditure, this quickly becomes impossible without sacrificing service quality. Commercial systems are quite reliable these days. For existing stable systems, 'Do nothing' is good for at least 3 years - then you are in deep trouble.
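
A few lines of Python make the compounding problem plain (the 5% figure is from the paragraph above; the 7-year contract length is an assumption):

    # Cumulative effect of a 5% cost cut demanded every year of a 7-year contract.
    base = 100.0
    for year in range(1, 8):
        print(year, round(base * 0.95 ** year, 1))
    # By year 7 the budget is ~69.8% of the original - a 30% real cut,
    # expected without any staff training or capital investment.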

The only ways to increase profits are to reduce expenses or increase non-base income.

Every service request is deemed a 'change' and subject to the full, heavyweight, project evaluation methodology. No project, not even buying a simple standalone appliance, takes under 4 man-weeks ($20-50,000). For the client, this stifles change/innovation (or forces it underground) and these additional costs overshadow most systems costs.

Capital expenditures are worse. Payback has to be within 12-18 months - and it has to beat 'do nothing'.
Since the 2003 slowdown in Moore's Law for CPU speed, the problem has compounded.

Take a 5 year-old file server that is now close to saturated most of the day. It is not yet 'end of life' and maintenance costs are still low.
Because file open/close, read/write performance is not specified and the system is "available" during work hours, the Client cannot complain.
The Operating System (O/S) may be old and need constant attention, updates and reboots - but these are part of the normal admin workload, so not an 'additional' cost. Salaried staff, as 'professionals', must work whatever unpaid overtime is demanded.

Any proposal to replace the server or upgrade it has to pass a simple, and reasonable, test:
How much extra revenue will we make? How long will the payback period be?

'Do nothing' is the benchmark - for zero capital expenditure and a few extra unpaid admin hours, a service is provided that brings in the service's full revenue - and will continue to do so. That's a very tough argument to beat.
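
A back-of-envelope sketch of that test, with every figure invented for illustration, shows why 'do nothing' wins:

    # Hypothetical payback test for replacing an ageing file server.
    # All numbers here are assumptions, not real contract figures.
    capex         = 15_000.0   # new hardware plus migration labour ($)
    extra_revenue = 0.0        # a file server rarely generates new billable income
    cost_savings  = 400.0      # $/month saved in admin time, power and incidents

    monthly_benefit = extra_revenue + cost_savings
    payback_months = capex / monthly_benefit if monthly_benefit else float("inf")
    print(f"Payback: {payback_months:.0f} months")   # ~38 months - well past a 12-18 month hurdle
    # 'Do nothing' spends no capital and still collects the full service fee,
    # so under this test the replacement loses every time.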

Only when the client funds the replacement, hardware maintenance costs climb high enough, an O/S upgrade is required for security or compatibility, or qualified admin staff move on, will the system be upgraded. And then it will begin the same inevitable slide into entropy and uselessness.

Finding solutions that benefit the customer and reduce operating expenses is career suicide for outsourcing staff in a culture focussed on increasing billables.

For example: a major Australian bank replaced all the local file servers with small Network Appliance NAS's. These are the most expensive product per Gb available. The outsourcer had charged ~$2,500/month to 'administer' these systems. The bank paid for the change in under a year, increased availability and performance, and solved many other issues to boot.

If the client gives all its IT staff to the outsourcer, who is going to seek out, design and implement new cost saving technology/systems?
Not the outsourcer - it's not in the contract and not in its (short term) interests.
The client has no IT staff - so it cannot and doesn't happen.

Audits and an Impartial Umpire

Who reports to the Client on the performance of their systems?
Who has the training/qualifications to check and assess the metrics and reports?
Who maintains & audits the basis of payments - the asset register?

Only the Outsourcer.

What are the downsides to the Outsourcer of a major failure in Prime Time?
A small number of 'service credits'.
Meanwhile, the Client suffers real costs and potentially large losses.

The Client wears all the business and financial risk with only minor penalties to the Outsourcer.
We are yet to see a corporate collapse due to an outsourcer's IT failures - but it is only a matter of time.

There is a clear conflict of interest, or a real Agency Theory problem.
The outsourcer is Judge, Jury and Executioner...
There is no way to hold them to account or dispute their figures.


The Profession of IT

Contributors Michael Ellis, BJ, Kevin James, Richard Steven Hack,... started a thread about the 'value'/competency of individual IT practitioners.

The huge (100+:1) variability in individual competence and the inability to measure it is one of the worst problems in our industry.

IT is not a 'Profession'. It, like 'Management', fails a very simple test:

What are the personal and organisational consequences of repeating, or allowing to be repeated, a known error, fault or failure??
[Do your mistakes have clear 'consequences' professionally?]

Mostly it is "fire/blame the innocent, promote the guilty". The exact inverse of what you'd want.
People may trump technology and process, but Politics trumps everything...

And our Professional Bodies don't help.

The only real research into the causes of Project Failure is by consultancies - who are driven by the ability to sell their products, not what will benefit the Profession.

The ACM, IEEE, IFIP and friends have abrogated their responsibilities. We, on the firing line, get to suffer their inaction.

Managers have to go with what they can quantify and inspect. Good managers will see through the B/S - but mostly too little, too late. Mostly, office politics, influence and self-promotion rule.

The adversarial nature of Outsourcing and the seemingly universal decline in code and service Quality stems from this failure of IT as a Profession.



Steve Jenkin 29-December-2007

2007/10/20

MinWin: 25Mb Windows. Hypervisor expected?

Could this be the start of a real change at MSFT? [i.e. doing software 'properly' (small, fast, secure)]

First question:
  • What if they pick up GPL or similar code & illegally include it?
  • How would that be detected??

Timeframe for 'commercial' MinWin is 2010.
The real news here is MSFT's focus on virtualisation...
With their purchase of "Virtual PC", they have the tools to build their next-gen O/S products around VM's.
Licensing???

Another question:
  • If MS kernels ship with a hypervisor, how do we dual-boot or run our own VM like XEN?
  • Would they be stupid enough to "embrace & extend" the VMI API/paravirt_ops?

The actual talk was on virtualisation and its impacts. [143Mb]
<http://www.acm.uiuc.edu/conference/2007/video/UIUC-ACM-RP07-Traut.wmv>


<http://blogs.zdnet.com/microsoft/?p=842>

Traut spent most of his time describing Microsoft’s thinking around virtualization, and how virtualization can be used to ease backwards compatibility and other problems Windows users incur.


Microsoft has created a stripped-down version of the Windows core, called MinWin, that will be at the heart of future Windows products, starting with Windows 7, the Windows client release due in 2010.


MinWin is 25 MB on disk; Vista is 4 GB, Traut said. (The slimmed-down Windows Server 2008 core is still 1.5 GB in size.)

but no graphics subsystem


The MinWin core is 100 files total, while all of Windows is 5,000 files in size.

Runs in 40Mb of memory. ASCII only.

MinWin will be at the heart of future versions of Windows Media Center, Windows Server, embedded Windows products and more.

First good MSFT decision I've heard in ages

Traut said he is running a team of 200 Windows engineers working on the core kernel and Windows virtual technologies.

C.f. 10,000 total on Longhorn/Vista. Say 3,000 coders

(he) said that Microsoft is operating under the premise that “at some point, we’ll have to replace it (the kernel),” given that it “doesn’t have an unlimited life span.

That's important news

2007/10/05

Open Source - Barriers to Entry



I think I have a short, coherent description of the underlying cause of
the barriers to adoption of Open Source:

"Some Thinking/Expertise Required"
(as in "Some Assembly Required" or "Batteries not included")

It stems from:
Is IT well-managed?

Which leads to:
Is "Mangement" generally practiced well??

To both of these, my answer is a strong "NO" - it's all about failure of
management.

The Usual Management Method


I've seen very consistent behaviours, attitudes and approaches across
every organisation I've worked in [a very large cross-section]. I don't
know where they arise or how - but they are best described as
'unschooled' or 'hard knocks'.
Certainly not 'insightful', educated nor informed... That appears to be
anathema.

I've met precious few managers that I'd call competent, let alone good.
And very few who'd bothered to train in their work.
One (a scientist in charge of 100 ppl and a $30M budget) bragged "I've
never done *any* management training".
His PhD in biology qualified him for everything...
[The subtitle of "Other People's Money" is: 'Arrogance, Ignorance and
Self-Delusion'. Widespread, it seems.]

Perhaps this one point, consistent management training, is the reason
IBM dominated the computing industry for 3 decades...
[And their avarice/denial brought them undone]

Professional Management & Management Profession


'Management' doesn't qualify as a 'Profession' under my definition:
  • an identified & testable set of practices, skills, competencies
  • (behaviour?) [think pilot or surgeon]
  • means to provide barriers to entry and disqualification/discipline
  • Improvement/Learning mechanisms:
  • by invention/discovery
  • by incremental improvement
  • analysis of failure & root-cause analysis + corrective actions (think 'bridge falling down' or plane crash)

IT Management and general Management


Without Professional & competent business managers, there can be no good
management of IT.
Without good IT management, good practices and competent practitioners
are rare and can't be maintained...

Summary:
IT is populated mostly by a bunch of rowdy, undisciplined 'cowboys'
that are set in their ways and do what they please.
IT management is about politics, influencing and pleasing, not any
rational, objective measures.

That explains the Fads & Fashions of Management, and the almost
universal CIO mantra "nobody got fired for buying <fad/fashion>".
And of course:
  • Risk Avoidance & Blame Shifting [consultants & outsourcers]
  • CYA

Implications for Open Source business


How to use this premise?
  • Wait for the current fashion to collapse [or have cracks]. All fads & fashions change.
  • Find the few competent business & IT managers out there and sell to them...
  • Sell them camouflaged/disguised systems - like embedded devices or appliances (e.g. network, storage, security)

2007/06/07

Why IPTV won't work in Australia - ISP Volume Charging.

It's all about cost...

A phone line costs $25-$35 just to have. Add ADSL plans on top.

ADSL plan costs vary with speed and pre-paid download volume.
Reasonable TV needs around 4Mbps these days - the highest speed and cost plans.
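
A rough sketch of the volume problem (the viewing hours are an assumption; the 4Mbps figure is from above):

    # Back-of-envelope: monthly download volume for IPTV at ~4 Mbps.
    stream_mbps   = 4.0     # 'reasonable TV' bit-rate
    hours_per_day = 2.0     # assumed household viewing
    days          = 30

    gb_per_month = stream_mbps / 8 * 3600 * hours_per_day * days / 1000
    print(f"~{gb_per_month:.0f} GB/month")   # ~108 GB - far beyond typical pre-paid quotas

That volume, on top of the line rental, prices IPTV out of most ADSL plans.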


2007/06/03

Turnarounds

My previous post on 'Digging Out', a methodology built from experience in a number of turnarounds, can't stand alone without some justification:
What have I done to be credible?

Here's a sample taken from a version of my CV:
Completing large business critical projects on-time and on-spec. In complex political environments achieving all client outcomes without formal authority.
  • ABN online registration system - Business Entry Point, DEWRSB.
  • Y2K conversion - Goodman Fielders (Milling and Baking)
  • Y2K remediation, CDM
  • DFAT (ADCNET Release 3, ADCNET build/release system)
  • TNT
  • Unisys/Customs (EDI & Finest/Nomad)
  • CSIRO - all Australian daily weather database
  • Diskrom ($350k project income in a year)
ABN registrations:
The ATO paid for and ran the project. The software contractor was combative and unhelpful. The environment was complex - around 6 different organisations for this one project, and another 10 projects. To get anything done required a huge amount of effort, negotiation and discussion.

The software contractor hadn't built any load monitoring and response time facilities into the system nor made any provision for capacity planning and performance analysis.

On my own recognisance, I designed and built some simple but sufficient tools, did the performance analysis - and accurately predicted the final peak load - 20 times the design load, and after diagnosing a catastrophic failure mode, designed a simple solution.

This 100% availability over 3 months was not accidental and directly contributed to 600,000 registrations of 3.3M being done on-line (around 15 times the estimate) and the site not attracting bad press like other aspects of the operation. Definitely a high-profile site.

The software contractor had to be dragged kicking and screaming all the way through the process. But I got the outcome the client needed - the site kept running with good response time on the busiest day. Some years later I analysed the performance data from the time and uncovered a few more nascent problems that we'd skirted around.

Goodman Fielders (Milling and Baking)
This was a Y2K project - they needed to migrate a legacy (Universe/Pick) application to a Y2K compliant platform and simultaneously upgrade the software and remerge all the 6 variants.

The application ran their business - ordering, tracking, accounting - the whole shooting match.
And for accounting systems, the Y2K deadline was the start of the financial year - 1-July-1999.

The work got done, I contributed two major items: deferring non-accounting conversion and moving the new single system to a properly managed facility.


DFAT (ADCNET Release 3, UNCLgate, ADCNET build/release system)
This was bizarre - I ended up sitting 20' from the desk I used when I worked on the system being replaced ('the IBM'), when it was being commissioned.

ADCNET failed and went to trial with the developer losing on all counts. It's worth reading the decision on 'austlii.edu.au' [Federal Court]. That account is certainly more detailed than the ANAO report. It was obvious in 1995 that the project could never deliver, let alone by the deadline. So I did my tasks and went my way.

I was called back again to debug and test an email gateway between the IBM and ADCNET (R2) for Unclassified messages. This was the first time I realised that being better than the incumbent staff in their own disciplines was 'a career limiting move'. Showing experienced, supposedly expert, programmers how to read basic Unix 'man' pages and act on them was a real lesson. A major problem that caused queued messages to be discarded was found and fixed by my testing - along with a bunch of the usual monitoring, administration and performance issues being solved.

I was called back again to help with the Y2K conversion of ADCNET (departmental staff were doing it). The system was over a million lines of code and the release/development environment bespoke. And the required maintenance work on the dependencies/make side of the software had never been done. A few months' part-time work saw all that tamed.

TNT
I went for a year as an admin. Did what I could, but they were past redemption... Bought out by the Dutch Post Office (KPN) soon after I'd arrived.
Presented a paper at a SAGE-AU conference detailing my experience with their 'technical guru' - who'd considered himself "World's Greatest Sys Admin". Google will find the paper for you.
It was so woeful, it defies description.

Unisys/Customs (EDI & Finest/Nomad)
In early 1995 I was called in to replace a SysAdmin for 8 weeks on the Customs "EDI Gateway" project. The project and site were a disaster - so much so that Unisys listed it as a "World Wide Alert" - the step before they lost the customer and hit the law courts.

In two months the team stabilised the systems, going from multiple kernel 'panics' [a very bad thing] per week, 8-10 hour delays in busy hour and lost messages - to 100% uptime over 6 systems, 1-2 second turnarounds and reasonable documentation, change processes and monitoring/diagnosis tools. The Unisys managers were very appreciative of my efforts and contributions. The same sort of chaos was evident in the 2005 Customs Cargo clearance System debacle. [The 'COMPILE' system ran on Unisys 2200 and was being replaced over a 10-year period. It was the back-end for the EDI systems I worked on.]

So much so, that I was called back for another few months to stabilise another system running ADABAS/Natural legacy applications that provided the Financial Management & Information Systems and Payroll/Personnel system. Another high-profile, critical system.

CSIRO - all Australian daily weather database
The research scientists on the project I worked for created some tools to analyse weather data - and had found a commercial partner to sell them. The partner was not entirely happy due to extended delays and many unkept promises. I'd been told that to buy the entire dataset - a Very Good Thing for the commercial partner - was not affordable, around $20,000 for the 100 datasets from the Bureau of Meteorology. When I contacted the BoM, they not only provided the index in digital form for free, but the whole daily datasets would cost around $1,750. I scammed access to another machine with the right tape drive, wrote scripts and did magic - and stayed up overnight reading the 70-odd tapes. In pre-Linux days, there was no easy way to compress the data and move it around.

The whole dataset as supplied was 10Gb raw - and I only had a 1Gb drive on my server [$3,000 for the drive!].

It took 6 weeks to fully process the data into their file format. And of course I had to invent a rational file scheme and later wrote a program to specifically scan and select datasets from the collection.

The Commercial Partner got to release the product at the ABARE 'Outlook' conference with a blaze of CSIRO publicity. Don't know what the sales were - but they were many times better.
The research scientist got a major promotion directly as a result, and I was forced to leave for having made it all possible.

Diskrom ($350k project income in a year)
In under a year I learnt SGML, hypertext and enough about handling large text databases [1991 - before the Web had arrived], took over and completed 3-4 stalled and failing projects, migrated datasets and systems, designed tools and file structures/naming conventions and completed the first merge of the Income Tax Assessment Act with the Butterworths commentary, sped up processing of a critical step by 2,000 times - all of which directly contributed $350,000 in revenue [apportioned to my effort] - or around 12 times my salary.

So it's natural that everybody else in the office was given a pay rise, I was told that I was technically brilliant but not worthy of a rise and one of the 'political players' was promoted to manage the group. With a number of other key technical 'resources' I left to pursue other avenues.

Diskrom was shut down just a few years later when a new chief of the AGPS (Aus. Gov. Printing Service) reviewed the contract and decided they were being ripped off. They'd provided all the infrastructure and services, with the commercial partner paying for staff and computers - and despite lucrative contracts and overseas work, never saw any return.

Basis of Quality - What I learnt in my first years at work

On the 17th of January 1972, with 70+ others, I started as a cadet at CSR. This early work experience set the stage for how I approached the rest of my life at work. I've come to the conclusion that these early formative experiences are crucial to a person's performance and attitude for their entire working life. Breaking these early patterns can be impossible for some.

The cadets of my intake had the privilege to have unrivaled experiential life lessons in quality, safety, team building & working, skills development and personal responsibility.

The lessons I took with me into my career:
  • Quality workmanship and precision techniques
  • Formal quality techniques and analyses
  • Following precise analytical processes
  • Managing high work loads and meeting tight deadlines
  • Responsibility for high-value processes/outcomes
  • Satisfying stringent customer requirements
  • Respect for, coping with and managing work in dangerous
    environments - and experience in handling/responding to hazardous
    incidents.
  • And doing time sheets and "filling in the paper work" as a natural part of things.

My first 2 years of full time work as a cadet chemical engineer at CSR Ltd saw me personally responsible every day for $10M’s of product – performing sugar analysis for the whole of the Australian sugar crop. At the time the price of sugar was at an all time high. Each of us played our part in the process - no one person did it all, but one person could botch the work of a whole team, or even a whole day's work.

At these NATA certified laboratories, we were trained in Chemical Analysis - but also safety and quality methods - with lives and ‘real money’ at stake.

Routinely large quantities of flammable and explosive alcohols, and highly toxic and corrosive acids were used and safely disposed of.

Deadlines were tight and fixed – each day samples for 50-100,000 tonnes of raw sugar and full analysis had to be delivered with a very high degree of accuracy and certainty that same day.

Speed, precision and absolute dependability were instilled in us alongside a clear knowledge of the value and consequences of our work - and mistakes.

We were tutored in analytical techniques, trained in reading and following processes exactly, statistical analysis, fire and safety (OH&S) skills, certified first-aid and our duties and responsibilities to our clients - the sugar producers.

It was expected that "people make mistakes" - the first rule of any precise analysis is the error range (+/- ??). The system was designed consistently to produce accurate, repeatable results with very low margins of error. Calibrated samples were fed through the process alongside multiple samples and any 'repeats' that had failed the checking process. The performance of individual analysts and groups was tracked and monitored. People were assigned to tasks suited to their particular talents - based on objective results, not presumption or bias.

We all acquired robust team working skills. Along with how to do the boring things like time sheets with exactitude.

The next year I spent working as an Analyst in the Pyrmont Sugar Refinery.

Lots of routine and tight deadlines - and the same attention to detail, importance of results and a necessity to understand 'the big picture'.

There'd been an analyst, an ex-secretary, who'd dropped her (mercury) thermometer into the 'product stream'. She hadn't realised it was a problem, and a few hundred tons of sugar had to be thrown away and the process shut down for cleaning. A tad expensive - many times her wage, and far more than the cost of the training and supervision that would have prevented it.

My routine work led to uncovering a systematic problem with a very simple and preventable cause. We had a power station on-site - a real live coal burning, electricity and high-pressure steam (50 atmosphere?) producing plant. It used enough coal that when a fire in the furnace hopper caused a malfunction, the whole of the Sydney CBD was covered in a thick pall of black smoke - which made the TV news.

The plant feed-water was supposed to be closely maintained near a neutral pH - not acidic. A test station was on the factory floor, with a reagent bottle next to it. The reagent ran out, so I made a new batch - only to find that the measurements were now grossly wrong. The bottle had been stored without a lid, so the reagent had slowly evaporated and concentrated. Over a couple of years the pH measurement had slowly drifted and the water had been pushed way out of spec.

Serendipitously a major explosion or equipment failure was avoided. Replacement of the power-station, and shutting down the whole of the Pyrmont facility dependent on it for a couple of years, would've seriously impacted the whole company.

Digging Out - Turning around challenged Technical Projects/Environments

Something I wrote in 2002:

‘Digging Out’ - 7 Steps to regaining control

This is a process to regain administrative control of a set of systems. It can be practised alone or by groups and does not require explicit management approval, although that will help.

‘Entropy’ is the constant enemy of good systems administration – if it has blown out of control, steps must be taken to address it and regain control. The nature of systems administration is that there is always more work than can be done, so deciding what not to do, where to stop, becomes critical in managing work loads. The approach is to ‘work smarter, not harder’. Administrators must have sufficient research, thinking & analysis time to achieve this – about 20% ‘free time’ is a good target.

This process is based on good troubleshooting technique, the project management method (plan, schedule, control) and the quality cycle (measure, analyse, act, review).

The big difference from normal deadline based project management is the task focus, not time. Tasks will take whatever time can be spared from the usual run of crises and ‘urgent’ requests until the entropy is under (enough) control.

Recognition

Do you have a problem? Are you unable to complete your administration tasks to your satisfaction within a reasonable work week? Most importantly, do you feel increasing pressure to perform, ‘stressed’?

Gather

The Quality Cycle first step is ‘Measure’. First you have to consciously capture all the things that 1) you would like to do to make your life easier and 2) take up good chunks of your time.

The important thing is to recognise and capture real data. As the foundation, this step requires consistent, focussed attention and discipline.

The method of data capture is unimportant. Whatever works for the individual and fits naturally in their work cycle – it must NOT take significant extra time or effort.

Analyse

Group, Rewrite, Prioritise.

Create a ‘hard’ list of specific tasks that can be implemented as mini projects that can be self managed. Individual tasks must be achievable in reasonable time – such as 1-2 days' effort. Remember you are already overloaded and less than fully productive from accumulated stress.

Order the list by 1) business impact and 2) Daily Work-time gained.

The initial priority is to gain some ‘freeboard’ – time to plan, organise and anticipate, not just react.

Prioritisation can be done alone if there is no explicit management interest.

It will surprise you what management are prepared to let slide – this can save you considerable time and angst.

Act


Having chosen your first target, create time to achieve it. This requires discipline and focus. Every day you will have to purposefully make time to progress your goal. This means, for a short period, spending more time working or postponing less urgent requests.

Do not choose large single projects initially, break them into small sub projects.

When you start, schedule both regular reviews and a ‘drop-dead’ review meeting – a time by which, if you haven't made appreciable progress on your task, you stop and re-plan your approach.

Review

How did it go? Did you achieve what you wanted? Importantly, have you uncovered additional tasks? Are some tasks you've identified actually unnecessary?

If your managers are involved, regular meetings to summarise and report on progress and obstacles will keep both you and them focussed and motivated.

‘Lightweight’, low time-impact processes are the watchword here. You are trying to regain ‘freeboard’, you do NOT need additional millstones dragging you further into the quagmire.

Iterate

Choose what to do next. If you’ve identified extra or unnecessary work items, re-analyse.

When do you stop this emergency management mode? When you’ve gained enough freeboard to work effectively.

A short time after the systems are back in control and you are working (close to) normal hours, you should consider scheduling a break. You’ve been overworking for some time and have lost motivation and effectiveness. A break should help you freshen up, gain some perspective and generate ideas for what to do next.

Maintain

What are you and your managers going to do to keep on top of things? How did you slide into the ‘tar pit’ in the first place? What measures or indicators are available to warn you if this starts to repeat?

How will you prevent continuous overload from recurring?

2007/06/02

Commercial Software is Good - because you Have Someone To Sue.

A friend came back from an ITIL Practitioners course with an interesting story:
  • The course was mostly about the complexities of handling Commercial Licenses, no two of which are the same.
  • The course provider made the not unexpected statement about Open Source:
    "Don't use it because you have nobody to sue".
    And they went on to ask "Why use OSS?"
His response was: "Because it's best of breed, especially rock-solid utilities & tools".
And they continued to not listen...

This note is NOT about that particular mindset [Risk Avoidance, not Risk Management].

I'd like to give him, and other technical people like him, a "slam dunk" one-liner response to each of these questions:
  • Why OSS - because it's best of breed!
    Why use a bug-ridden, poorly functioning piece of commercial software when the best there is is rock-solid, secure & Free and Open?
    Not only do you remove the need to sue *anybody*, you get the best tool for the job and know it will never be orphaned, withdrawn or torpedoed.

    Or you may be held to ransom with enormous support costs - the Computer Associates model of buying 'mature' software and raising the support costs to turn a profit until the customer base bails out.
  • Using rock-solid OSS apps. means you are unlikely to need to sue anybody. It "just works", not "works just".
    And if you have concerns over "prudent commercial Risk Management", just hire an OSS support organisation that has both "Professional Indemnity" and OSRM insurance.

And I tried to quickly find *two* significant lists for him:
  • widely used open-source software [Apache, Samba, Perl-PHP-Python-Ruby, gcc/make/cvs/subvers, Eclipse, ...]

    The caveat on this list is that I need estimates of the market share or extent of use of the software. Viz: for Apache, the Netcraft survey:

  • OSS support organisations. [remember Linuxcare?]
If you have pointers to that data, I'd love to hear from you.

Who can we sue? Or - the Myth of Riskless I.T. Management

This started as a conversation on an Open Source list - on how to respond to people that assert:

"We can't use Open Source because there's Nobody to Sue".

2007/06/01

Why "IT Service Delivery and Management" should be an Academic Discipline

IT Service Delivery is where "the rubber hits the road". Without sufficiently capable and reliable IT infrastructure, every other piece of the IT equation - Architecture and Design, Business Analysis, Project Management, Software Engineering, Software Maintenance, Information Management, Data Modelling, ... - becomes irrelevant. All other effort is wasted if the services don't run for the users.



All the potential benefits of IT, the 'cognitive amplifier' effects and leveraging people's skills and experience, rely on the delivery of IT Services.



Where is the academic discipline that:

  • Defines IT Service Delivery (and its OAM components - Operations, Administration and Maintenance)
  • Provides a framework to compare and audit the performance of IT Service Delivery in an organisation against benchmarks relevant to the industry.

  • Defines absolute and relative performance of individuals, teams and IT organisations.
  • Defines and explores performance, effectiveness, utilisation and 'service benefit conversion' metrics?



If the goal of IT/IS is to "deliver a business benefit" - but the benefits aren't trivially measurable, the deep knowledge/experience of the discipline of Marketing can be brought to bear. The first step in every project is to define 'the desired benefit', how it will be measured and reported, and the break-even or cancel point.



The academic discipline that informs practitioners, management and the profession on how to actually realise the benefits of IT/IS systems in practice.





ITIL and ISO 20,000



"ITIL" (IT Infrastructure Library) was first formulated in 1989 by the UK OGC (Office of Govt. Computing) to provide a common language and framework or conceptual model for IT Operations (now 'service management') - necessary for the process of tendering and outsourcing.



In 1999 it was re-released ('version 2') and new books written.

Mid-2007 sees the release of 'version 3' - another rethink and rewrite.



In 2000 ITIL spawned a British Standard, BS15000, revised and updated in 2002. In Dec 2005 BS15000 was adopted internationally as IEC/ISO20,000. It sits alongside "Information Security Management" ISO17799 [formerly BS7799:2002] and "Information Security" ISO27001. BS25999 addresses "Business Continuity".



Forrester Research in early 2007 reports (in "CIOs: Reduce Cost By Scoring Applications" by Phil Murphy) that 'IT Service Delivery' (Forrester calls it "lights on" operations and maintenance) is accounting for a rising percentage of IT budgets. Referenced in "Maintenance Pointers".



Reasons for a discipline of IT Service Delivery and Management



  • The Forrester survey of October 2006 reports IT Service Delivery consumes 80% or more of IT budgets - up from 60-65% ten years ago.
  • 100% of the User Utilisation of IT Software, Systems and Services is mediated by Service Delivery. It's where "the rubber hits the road".

  • IT is useful in business because it's a cognitive amplifier - it amplifies the amount of useful work that people can perform/process. IT provides "cheaper, better, faster, more, consistent/correct".
  • Business and Government are now dependent on their IT. We've crossed the event horizon where [in the 'developed' world] it's no longer possible to resile from IT systems.
  • IT is arguably still the greatest single point of leverage [staff effectiveness amplifier] available to organisations.
  • Service Delivery is the anchor point for "What Value does IT Deliver?"



Where the 'IT Services' discipline belongs

There are two requirements for a faculty teaching 'IT Services' or 'IT Service Management'

  • Business and Management focus, and
  • Ready access to large, complex "IT Services" installations
Traditional computing and IT faculties are focussed on the internal technical aspects of computing. 'IT Services and Management' is about delivering and realising Business Benefits - the managerial focus. The necessary disciplines and knowledge/expertise already exist in Business/Commerce/Management Schools - and are somewhat foreign to traditional CS/ISE/IT Schools.



Canberra, accounting for 20% of the IT expenditure in Australia, is well placed to initiate 'IT Service Delivery and Management' in this country.

2007/05/28

Flash Memory, Disk and A New Storage Organisation

The raw data for this table. It's not definitive, but meant to be close to reality. I haven't included tape media, because I have no reliable price data - and they are not relevant to the domestic market - CD's and DVD's are one of the best, cheapest and most portable/future proof technologies for enterprise and domestic archives and backups.
A Previous Post quotes Robin Harris of ZDnet (Storage Mojo).

Edit (04-Jun-07): George Santayanda on his storage sanity blog writes on the Flashdance. Cites 80%/pa reduction in flash prices, break-even with HDD in 2010/11. He's been in storage for years and is a senior manager. And doesn't take things too seriously.

And he points to a Powerpoint by Jim Gray on the Flash/Solid State disk. Worth the read.

Storage Price Trends

The Yr/Yr ratios are used for forward projections.

The 'Est Flash' column uses the current 'best price' (MSY) for flash memory and a Yr/Yr ratio of 3.25.

Flash memory is now very much cheaper than RAM, so forward projections for RAM were not done.



Year    RAM $/Gb   Flash $/Gb   Est Flash $/Gb   Disk $/Gb   DVD $/Gb   Max flash   Max Disk
2002      936.00      1176.00          1176.00        3.13       0.57         256        120
2003      384.00       700.00           700.00        2.75       0.38         512        160
2004      298.00       400.00           400.00        1.29       0.26        1024        250
2005      189.00       158.00           158.00        0.82       0.17        1024        320
2006      135.00        87.50            87.50        0.62       0.17        2048        500
2007      159.00        24.75            11.06        0.37       0.15        4096        750

Yr/Yr ratio              2.26             3.25        1.57       1.33

Forward projections ($/Gb):
2008                    10.95             3.40        0.24       0.11
2009                     4.84             1.05        0.15       0.08
2010                     2.14             0.32        0.10       0.06
2011                     0.95             0.10        0.06       0.05
2012                     0.42             0.03        0.04       0.04

Depending on the "Year-on-Year" ratio you choose for the reduction in $/Gb of Flash memory, and if you think both flash and disk drives will continue their plunge down the price curve, solid state memory (flash) may be the cheapest form of storage in under 5 years.
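
The projections are nothing more than compound division by the Yr/Yr ratio; a minimal sketch using the 2007 prices and ratios from the table above:

    # Project $/Gb forward by dividing each year by the observed Yr/Yr ratio.
    ratios     = {"Flash": 2.26, "Est Flash": 3.25, "Disk": 1.57, "DVD": 1.33}
    price_2007 = {"Flash": 24.75, "Est Flash": 11.06, "Disk": 0.37, "DVD": 0.15}

    for year in range(2008, 2013):
        row = {k: price_2007[k] / ratios[k] ** (year - 2007) for k in ratios}
        print(year, {k: round(v, 2) for k, v in row.items()})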



New Storage Organisation

Backups and Archives

The price of large, commodity disk drives is driving down towards DVDs, and will probably overtake them within 5 years - and that's ignoring the cost of optical drives and the problems of loading the data you want. Why would you not want to store backups and archives on disk?



For safe backups, the disks cannot be in the same machine, nor actually spinning. If the disk is in a USB enclosure, this means being able to spin it down on command.



Small businesses can effect a safe, effective off-site backup/archive solution by pairing with a friend and using 'rsync' or similar over the Internet (we all have DSL now, don't we?) to a NAS appliance. The NAS does need to store the data encrypted - which could be done by encrypting at the source before 'rsync' runs, or by using an encrypted file system and rsync'ing the raw (encrypted) file. The best solution would be to have the disks spin down when not being accessed.
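
A minimal sketch of the 'pair with a friend' arrangement, assuming ssh access to the friend's NAS and an already-encrypted backup image on the local side (the hostname and both paths are made up):

    #!/usr/bin/env python3
    # Push a locally-encrypted backup image to a friend's NAS with rsync over ssh.
    # 'friend-nas.example.net' and the paths are placeholders.
    # The image is assumed to be encrypted before rsync runs, so the friend
    # only ever stores ciphertext.
    import subprocess

    SOURCE = "/backups/office.img.enc"
    DEST   = "backup@friend-nas.example.net:/volume1/swap-backups/"

    subprocess.run(
        ["rsync", "--partial", "--inplace", "--compress", "-e", "ssh", SOURCE, DEST],
        check=True,
    )
    # A job on the NAS end could spin the drive down afterwards (e.g. 'hdparm -y'),
    # so the backup disk isn't left spinning between runs.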



This technique scales up to medium and large businesses, but probably using dedicated file-servers.



And the same technique - treat disks like removable tapes - applies. If drives are normally kept powered down, the normal issues of wearing out just won't arise. Rusting out might prove to be an issue - and little effects like thermal shock if powered up when very cold.





Speed, Space and Transfer Rate

Robin Harris of ZDnet etc writes on Storage. He's flagged that as commodity disks become larger, a few effects arise:

  • single-parity RAID is no longer viable, especially as disk-drive failure is correlated with age. Older drives fail more often. The problem is the time to copy/recreate the data - MTTR, and the chance of another failure in that window whilst unprotected.

  • Sudden drive failure is the normal mode. Electronics or power supply dies - game over.

  • Common faults in a single batch/model are likely to cause drives to fail early and together, and
  • RAID performance is severely impacted (halved) when rebuilding a failed drive.
Harris is fond of quoting a single metric for drives: (I/O per second) per Gb. Which is a great way to characterise the effective speed of drives. The size of drives has been doubling every couple of years - but the speed (rotational and seek) has been increasing much more slowly... Big drives, even lashed together in RAID arrays, can't deliver the same effective performance as a bunch of smaller, older drives.



This single figure of merit is half the equation: The other side is "time to copy".

That scales with transfer time, size, on-board cache and sustained read/write speeds.
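
A small sketch of the 'time to copy' side, using round-number assumptions for a circa-2007 SATA drive (capacity, sustained rate and failure rate are all illustrative):

    # Rebuild window for a failed drive, and a crude estimate of exposure during it.
    capacity_gb  = 750     # size of the failed drive
    rebuild_mb_s = 50      # assumed sustained rebuild rate, array otherwise idle
    afr          = 0.05    # assumed 5% annualised failure rate per surviving drive
    drives_left  = 6       # remaining members of a 7-drive single-parity set

    rebuild_hours = capacity_gb * 1000 / rebuild_mb_s / 3600
    # Chance that any surviving drive fails during the rebuild window (very rough).
    p_second = 1 - (1 - afr) ** (drives_left * rebuild_hours / (365 * 24))
    print(f"Rebuild ~{rebuild_hours:.1f} h, P(second failure in window) ~ {p_second:.3%}")

The exposure grows directly with capacity and with how busy the array is during the rebuild - which is why bigger drives make single-parity RAID look worse every year.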



What the World Needs Now - a new filesystem or storage appliance

Just as disk drives are bashing up against some fundamental limits - bigger is only bigger, not faster - Flash memory is driving down in price - into the same region as disks. [And nobody knows when the limits of the magnetic recording technology will be reached - just like the 'heat death' of Moore's Law for CPU speed in early 2003.]



Flash suffers from some strong limitations:

  • Not that fast - in terms of transfer rate
  • Asymmetric read and write speeds (5-10:1)
  • bits wear out. Not indefinite life.
  • potentially affected by radiation (including cosmic rays)
But it's persistent without power, physically small, very fast 'seek time', relatively cheap per unit, simply interfaced, very portable, (seems) reliable and uses very little power. Cheap flash memory only transfers around 5Mb/sec. Sandisk "Extreme 3" Compact Flash (CF) cards targeted at professional photographers, write at 20Mb/sec (and "extreme 4" double that).



"Plan 9", the next operating system invented by the group who designed Unix (and hence Linux), approached just this problem. Files were stored on dedicated "File Servers" - not that remarkable.



Their implementation used 2 levels of cache in front of the ultimate storage (magneto-optical disk). The two levels of cache were memory and disk.



The same approach can be used today to integrate flash memory into appliances - or filesystems:

  • large RAM for high-performance read caching.

  • large, parallel flash memories for read buffering and write caching
  • ultimate storage on disk, and
  • archives/snapshots to off-line disk drives.

The disk drives in the RAID still have to have multiple parity drives, and hot spares.




The Flash memory has to be treated as a set of parallel drives - and probably with parity drive(s) as well.


This arrangement addresses the write performance issues, leverages the
faster read speed (when pushing cache to disk) and mitigates the effect
of chip failure, bits wearing out and random 'bit-flips' not detected
and corrected internally.
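
A toy sketch of that data flow - RAM read cache, flash write buffer, disk behind them - with plain dictionaries standing in for the real media (no parity, wear-levelling or device code here; it only shows where reads and writes land):

    # Toy model of the layered store: RAM read cache, flash write cache, disk.
    class TieredStore:
        def __init__(self, flush_threshold=4):
            self.ram_cache = {}    # hot read cache (volatile)
            self.flash_buf = {}    # persistent write cache / read buffer
            self.disk      = {}    # ultimate storage
            self.flush_threshold = flush_threshold

        def write(self, key, value):
            self.flash_buf[key] = value       # writes land in flash first: fast and persistent
            self.ram_cache[key] = value
            if len(self.flash_buf) >= self.flush_threshold:
                self._flush()

        def read(self, key):
            for tier in (self.ram_cache, self.flash_buf, self.disk):
                if key in tier:
                    self.ram_cache[key] = tier[key]   # promote on read
                    return tier[key]
            raise KeyError(key)

        def _flush(self):
            # Push the write cache down to disk in one sequential batch.
            self.disk.update(self.flash_buf)
            self.flash_buf.clear()

    store = TieredStore()
    for i in range(6):
        store.write(f"block{i}", f"data{i}")
    print(store.read("block1"), "-", len(store.disk), "blocks flushed to disk")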



The only deep question is: what sort of flash drives to use?

  • compact flash are IDE (ATA) devices. Same pin-out (less the 2 outside) as 2.5" drives
  • SD card is small, cheap, simple - but bulk connections to computers aren't readily available

  • USB flash is more expensive (more interfaces), but scales up well and interfaces are readily available.
  • Or some new format - directly inserted onto PCI host cards...

USB or CF is a great place to start.

CF may cause IDE/ATA interfaces to re-emerge on motherboards - or PCI IDE-card sales to pick up.

2007/05/23

Why Ideas are 'Cheap' or Execution is everything

"Genius is 1% Inspiration and 99% Perspiration": Thomas Alva Edison

Summary

Ideas cost very little to generate and, without substantial additional effort, come to nothing. But new ideas are the only starting point for new things - so which is more important, coming up with the idea or making it concrete? Both are necessary and as important as one another - without the other, neither will lead anywhere. Criticism of others' ideas without substantial evidence, proof, counter-example or working demonstration is churlish. "Put up or Shut up" is a reasonable maxim for critiquing ideas. Ideas only take real form and viability if they are the subject of robust and probing debate and defence. It is better to fail early, amongst friends, than publicly and spectacularly. It's easy to confuse a profusion of ideas with "invention". The marker of "usefulness" is the follow-through from ideation to implementation.

2007/05/19

Microsoft, AntiTrust (Monopolies) and Patents

MSFT is threatening to Sue the Free World.
Patents are a state-granted Monopoly in return for full disclosure.
  • Patents are useless unless defended.
  • Patents are granted in a single jurisdiction at a time - there are no 'global' patents.
  • Patents are uncertain until tested in court - by the full panoply of judges, counsel and mountains of paper.
  • Patent 'trolls' and 'submarining' exist (and are legal tactics) - people who play the system for Fun and Profit. They hide out until someone is successful, then don't try to license their patents - but sue (for large amounts).
Microsoft may claim that code infringes its patents - but that's just a posture. If they were for real, they'd be launching court cases to decide the matter.

2007/05/15

Microsoft Troubles - III

Microsoft threatens to Sue The Free World.

Groklaw comments on MSFT threatening to sue "Patent Violations".

CNN/Fortune Original article (probably)

ZDnet (Mary Jo Foley)

2007/05/06

Driving Disks into the future

Robin Harris of ZDnet "Storage mojo" has written a series of posts on factors affecting the future of Disk Storage. These are my reactions to these quotes and trends, especially in flash memory.



Flash getting "70% cheaper every year" - hence more attractive:



"Every storage form factor migration has occurred when the smaller size reached a capacity point that enabled the application, even though it cost more per megabyte."

"With flash prices dropping 70% a year and disks 45%, the trend is inexorable: flash will just get more attractive every year."




The problems with RAID and Big Drives:



"There are three general problems with RAID: Economic, Managerial, Architectural"

  • RAID costs too much
  • Management is based on a broken concept [LUN's]
  • Parity RAID is architecturally doomed
"The big problem with parity RAID is that I/O rates are flat as capacity rises. 20 years ago a 500 MB drive could do 50 I/O per second (IOPS), or 1 IOPS for every 10 megabytes of capacity. Today, a 150 GB, 15k drive, the ne plus ultra of disk technology, is at 1 IOPS for every 750 MB of capacity. Big SATA drives are at 1 IOPS per several gigabytes. And the trend is down."






What a "Web Business" wants from Storage Vendors:



What a Web Business wants [Don MacAskill of 'smugmug']:

  • External DAS for the database servers .. and dual-controller arrays [simplified recovery after server death]
  • Spindle love. Typical array has 14.
  • No parity RAID. RAID 1+0.
  • 15k drive love. Speed is good.
  • Love drive enclosures with odd numbers of drives. Makes keeping one hot spare easy.
  • Love big battery-backed up write caches in write-back mode. Because super-fast writes are “. . . easily the hardest thing in a DB to scale.”
  • Disable array read caching: array caches are small compared to the 32 GB of RAM in the servers. Reserve all array cache for writes.
  • Disable array pre-fetching: the database knows better than the array.
  • Love configurable stripe and chunk sizes. 1 MB+ is good.


"Don should be the ideal array customer: fanatical about protection; lots of data; heavy workload, not afraid to spend money. Yet he isn’t completely satisfied, let alone delighted, by what’s out there. A lot of the engineering that goes into arrays is wasted on him, so he’s paying for a lot of stuff he’ll never use, like parity RAID, pre-fetch and read caching."




And 'the future of Storage'



The future of storage:

"The dominant storage workload of the 21st century. Large file sizes, bandwidth intensive, sequential reads and writes."



"(OLTP) Not going away. The industry is well supplied with kit for OLTP. It will simply be a steadily shrinking piece of the entire storage industry. OLTP will keep growing, just not as fast as big file apps."



"Disk drives: rapidly growing capacity; slowly growing IOPS. Small I/0s are costly. Big sequential I/0s are cheap. Databases have long used techniques to turn small I/Os into larger ones. With big files, you don’t have to."



"The combination of pervasive high-resolution media, consumer-driven storage needs, expensive random I/0s and cheap bandwidth point to a new style of I/O and storage. The late Jim Gray noted that everything in storage today will be in main memory in ten years. A likely corollary is that everything analog that is stored today will be digital in 10 years."

2007/05/04

Response to Cognitive Work Load - Huh?

Another related question - one I've been trying just to find the correct name for over the last 10 years.
I can't believe something so fundamental to I.T. and the knowledge economy could go unstudied.

I frame it as "Human cognitive response to workload".

There is a whole bunch of data on "human physiological response to workload" - like the US Navy and how long stokers can work at various temperatures (and humidity?).

This goes to the heart of computing/programming - being able to solve difficult problems, and managing/reducing defects/errors. In my career, I got very tired of bosses attempting to get more work done by "forced marches". 80 hour weeks aren't more productive - they just ensure a very high defect rate and amazing amounts of rework.

The best I have been able to find is Dr Lisanne Bainbridge and her work on "mental load".

What I wanted to discover is:
  • that for each individual there is an optimal number of 'brain work' hours per week
  • the effect of physical & mental fatigue and sleep deprivation on 'brain work' output, the degree of difficulty of tasks that can be attempted, and error rate.
  • the recovery time for strenuous (mental) effort - working 50, 75 and 100 hours / week requires recovery, but how much?


If you, gentle reader, have any leads/pointers on this I'd really appreciate it :-)

Even if someone just knows what the field is called, or can refer me to the people that do know.

Teams - Where's the proof?

Addition 24-May-2007
Johanna Rothman, author and consultant, answered an e-mail from me.
Johanna is involved in the Jerry Weinberg and Friends AYE - Amplifying Your Effectiveness - conference. Johanna's book "Behind Closed Doors" is on this blog and highly recommended for I.T. Technical Managers. Another interest of Johanna's: Hiring the Best People.

Johanna's thoughtful response:
Part of the problem is I can't do two of the same project where one is set up as an integrated team and the other is a bunch of people who don't have integrated deliverables. I can tell you that the projects where the people are set up with committed handoffs to each other (Lewis' idea that one person can't work without the rest of them), have better project throughput (more projects per time period) than the groups of people who do not have committed handoffs to each other. But that's empirical evidence, not academic research.

2007/05/02

Defining I.T. Service Management

Objectives (The What)


Having begun around 1950, the world of Commercial I.T. is now mature in many ways. "Fields of Work" and professional taxonomies are starting to become standardised. Professional "Best Practices" are being documented and international standards agreed in some areas.

For the first time, audits of one of the most pragmatic I.T. disciplines, "Service Management", are possible with ISO 20,000. Business managers can now get an independent, objective opinion on the state of their I.T. operations - or of their outsourcers.

Being "documented common sense", ITIL and the related ISO 20,000 are good professional guides, but not underpinned by theory. Are there any gaps in the standard? How does Service Management interface with other IT Fields of Work? and What changes in those other disciplines are necessary to support the new audited practice?

Analysis of the full impact of I.T. Service Management, creation of a full taxonomy and definitions of "I.T. Maturity" are beyond the scope of a small "single researcher" project.

Approach (The How)

ITIL Version 2 and 3 and ISO 20,000, as published documents, form the basis of the project.
Prior work in the field has yet to be identified. Secondary research will be the first step.

Each of the models will be codified and uniformly described, then a 3-way comparison performed. A Gap Analysis will be done of the 3 models, a formal model built describing "I.T. Service Management" and its interfaces, and each of the existing approaches mapped to it.
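
A minimal sketch of the intended codification and gap analysis, assuming each framework can be reduced to a set of uniformly-named processes (the process names below are illustrative placeholders, not the real ITIL/ISO content):

    # Illustrative only: each model codified as a set of uniformly-named processes.
    models = {
        "ITIL v2":    {"incident", "problem", "change", "release", "capacity"},
        "ITIL v3":    {"incident", "problem", "change", "release", "capacity",
                       "service_portfolio"},
        "ISO 20,000": {"incident", "problem", "change", "release", "budgeting"},
    }

    # Processes common to all three models.
    common = set.intersection(*models.values())
    print("Common to all three:", sorted(common))

    # Gap analysis: what appears in one model but in neither of the others.
    for name, processes in models.items():
        others = set.union(*(p for n, p in models.items() if n != name))
        print(f"Only in {name}:", sorted(processes - others))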

Importance/Value (The Why)

The global economy, and especially businesses in the "Western Industrialised World", is increasingly dependent on I.S./I.T. and their continued efficient operation. Corporate failures partially due to I.S./I.T. failure have occurred. Improving the delivery of I.T. Services, and the business management and use of them, is important to reduce those failures in the future.

The advent of ubiquitous and universal computing requires concomitant development of business management.

These assertions are considered axioms in this context:
  • Organisations these days are dependent on their I.T. Operations.
  • I.T. cuts across all segments of current organisations.
  • I.T. defines the business processes and hence productivity of the whole organisation.
  • What you don't measure you can't manage and improve.
  • Improving the effectiveness of I.T. Operations requires auditable processes.
  • Common I.T. Audit and Reporting Standards, like the Accounting Standards, are necessary to contrast and compare the efficiency and effectiveness of I.T. Operations across different organisations or different units within a single organisation.
I.T. is a cognitive amplifier: it delivers "cheaper, better, faster, more, all-the-same", through the embedding of finely detailed business processes into electronic (computing) systems.

For simple, repetitive cognitive tasks, computers are 1-5,000 times cheaper than people in western countries.

From this amplification effect, computers still provide the greatest single point of leverage for organisations. They underpin the requirement to "do more with the same", improving productivity and increasing profitability.
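
A toy illustration of that ratio, with assumed figures chosen only to show the order of magnitude (the labour rate, task times and server costs below are mine, not measured data):

    # Illustrative assumptions only - chosen to land inside the 1-5,000x range above.
    labour_cost_per_hour = 40.0      # assumed fully-loaded western labour rate ($/hr)
    tasks_per_person_hour = 60       # assumed: one simple cognitive task per minute

    server_cost_per_hour = 0.50      # assumed amortised hardware, software and support
    tasks_per_server_hour = 1_800    # assumed: one task every 2 seconds, end to end

    human_cost_per_task = labour_cost_per_hour / tasks_per_person_hour
    machine_cost_per_task = server_cost_per_hour / tasks_per_server_hour

    print(f"Human:    ${human_cost_per_task:.4f} per task")
    print(f"Computer: ${machine_cost_per_task:.6f} per task")
    print(f"Ratio:    ~{human_cost_per_task / machine_cost_per_task:,.0f} : 1")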

The few studies of "IT Efficiency" that are available show that IT effectiveness is highly variable and unrelated to expenditure.

The value-add to business of a complete I.T. Service Management model is two-fold:
  • manage down the input costs of the I.T. infrastructure and Operations and,
  • audit assurance for the board and management of the continued good performance of I.T. Operations.


[A 1990 HBS or MIT study into "White Collar Productivity" - reported a decrease in the first decade of PC's]

Previous Work (What else)

There is much opinion in the area, without substantive evidence: e.g. Nick Carr and "Does IT Matter?" The McKinsey report/book on European Manufacturers and their I.T. expenditure versus financial performance shows there is no correlation between effort (expenditure) and effect (financial performance).

"Commonsense" IT Practitioner approaches, SOX, ITIL and COBIT and others, do not address the measuring and managing of I.T. outputs and interfaces and their business effects, utiliation and effectiveness.

Jerry Landsbaum's 1992 work included examples of his regular business reports - quantifiable and repeatable metrics of I.T. Operations phrased in business terms.

Hope to find (The Wherefore)


  • Create a formal model for I.T. Operations and its performance within and across similar organisations.
  • From the model, generate a standard set of I.T. performance metrics.
  • Generate a set of useful I.T. Operations Business Impact metrics.


Report Outline


  • Coded process models of ITIL version 2, 3 and ISO 20,000.
  • 3-way comparison of ITIL version 2, 3 and ISO 20,000.
  • Gap Analysis of ITIL version 2, 3 and ISO 20,000 models.
  • Formal I.T. Service Management model.
  • Common I.T. Service Management internal metrics and Business Impact
    metrics flowing from the model.
  • Interfaces to other I.T. and business areas and changes necessary to support audits of I.T. Service Management.
  • Further Work and Research Questions


Execution Phases

  • Learn ITIL Version 2 - Service Managers Certificate course [complete]
  • Learn ISO 20,000 - IT Consultants training [in process]
  • Acquire and learn ITIL Version 3 [depends on OGC availability. mid/late 2007]
  • Create/identify process codification.
  • Codify ITIL version 2, 3 and ISO 20,000
  • Compare and contrast coded descriptions. Report.
  • Create/adapt process description calculus for formal model.
  • Create formal I.T. Service Management model.
  • Derive interfaces to business and other I.T. processes
  • Derive internal metrics, role KPI's and business impact metrics
  • Finalise report.

2007/05/01

Bookshelf I

These are books on my bookshelf I'd recommend. Notes on them later.
Pick and choose as you need.

Personal Organisation


Personal Efficiency Program - Kerry Gleeson [older]
Getting Things Done - David Allen [newer]

Teams, People, Performance


Practice What you Preach - David H Maister [Numerical model relating Profitability to Staff Morale/Treatment]
How to be a Star at Work - Robert E Kelley
Team Management Systems - Margerison & McCann

Maximum Success: Breaking the 12 Bad Business Habits before they break you - Waldroop & Butler
[rereleased as] The 12 Bad Habits that hold Good People Back

No Asshole Rule - Robert Sutton

Execution - the art of Getting things done (in big business)


Who says Elephants can't Dance - Louis V Gerstner [on Execution and 'Management is Hard']
Execution - Bossidy & Charan
Confronting Reality - Bossidy & Charan

Gallup Research


First, Break all the Rules - Buckingham & Coffman
Now, Discover your Strengths - Buckingham & Clifton [old]
StrengthsFinder 2.0 - Tom Rath [current]
12: The Elements of Great Managing - Wagner & Harter
The One thing you need to know - Buckingham

Off the Wall - Different ideas on Management and Leadership


Contrarian's Guide to Leadership - Steven B Sample
Simplicity - Jensen

Intelligent Leadership - Alistair Mant [old]
Maverick - Ricardo Semler. [old]
The 7-day Weekend - Ricardo Semler [new]

Wear Clean Underwear - Rhonda Abrams
Management of the Absurd - Richard Farson
Charisma Effect - Guilfoyle

Computing Management


Measuring and Motivating Maintenance Programmers - Jerry Landsbaum
any of the 50 books by Robert L. (Bob) Glass

Why Information Systems Fail - Chris Sauer
Software Failure : Management Failure - Flowers

Jerry Weinberg - prolific author: Quality, People, Teams, Inspections & Reviews, technical, ...
Quality Software Management - 4 book series
Becoming a Technical Leader
Weinberg on Writing - the Fieldstone Method
Secrets of Consulting
Psychology of Computer Programming
-- and another 40 or so --

Peopleware - DeMarco & Lister, from Dorset House Publishing, which specialises in why people matter.

Project Retrospectives - Norm Kerth
Programming on Purpose - PJ Plauger

Mythical Man Month - Frederick Brooks [I don't have a copy]

"IT Doesn't Matter" - Nicholas G Carr [read a synopsis, don't buy]

2007/04/20

The End of the Internet, or the Microsoft Users Net-Meltdown?

The 2005 Australian Computer Crime and Security Survey (PDF) reports that at the end of 2004 "the hackers turned pro". The 2006 ACCSS index may be easier for downloads. [In 2016, the ACCSS was replaced by "the BDO and Australian Cybercrime Survey".]

For 2-3 years now, most malware has satisfied the definition of Organised Crime:
it's theft, it's purposeful, it's co-ordinated.

In an August 2006 post, I reported the ACCSS comments and new comments from SANS.

ZDNet now report that rootkits are becoming increasingly complex and operate by stealth. They say:

Rootkits -- malicious software that operates in a stealth fashion by hiding its files, processes and registry keys -- have grown over the past five years from 27 components to 2,400, according to McAfee's Rootkits Part 2: A Technical Primer (PDF).
If you use a Microsoft system and connect to the Internet without extensive protection, you should be afraid, very afraid. And even large organisations that do everything right are still open to targeted "zero day" attacks. The first Windows Vista security problems are being reported. It's better than their previous efforts, but still contains significant security flaws. The White House mandated a minimum security configuration for all US Federal Government Vista desktops.


2007/04/10

Microsoft troubles - II

Follow up to a previous post on MSFT hitting a 'financial pot hole' by 2010. The numbers look very, very bad to me. The seeming lack of management response and apparent leadership would deeply disturb me as a shareholder...
The Paul Graham piece Microsoft is Dead and the follow-up were a prompt for this post.

2007/04/09

Startups: selecting and nurturing.

A comment on Paul Graham's post Why to Not Not Start a Startup.

Paul, along with Robert T Morris (author of the 1988 Morris Worm, now an MIT associate professor), runs a Venture Capital firm.
They run Startup School as well. An exceptional idea.

At the end of this is a list of Paul's 16 points.

2007/04/08

Web 2.1 - Meta-tags by default

Why do we need fine products like Content Keeper, when the problem is one that should be solved at source?

[11-Apr-2007 Addition]
The "Kathy Sierra" affair caused Chris Locke, co-author of Cluetrain Manifeso to post his version/take. My take from reading about the affair.
This whole affair unfolded because "Web 2.0" not just allows, but
enforces, anonymity. Provable Identities don't exist.

In an hour's scrolling through posts, I never saw this point [or anything like it] made.
How far would this thing have gone if the police could've tracked the posters quickly and unequivocally?
Presumably within a day or so the perpetrators would've been identified and action initiated, legal jurisdictions allowing.

There are good reasons to allow & support anonymity on the Web - "Freedom of Speech" is part of it, along with denying Political suppression and enabling 'whistleblowing'.

But the ugly human stuff of stalking, intimidation and control-by-fear need effective checks and consequences.

[End Addition]

Knowing the type of content you are downloading is a basic right - the same way that we don't go into newsagencies, bookshops and libraries and get surprised by the content. The same way that various TV stations will broadcast 'social content' warnings before some programs (violence, 'disturbing or graphic images', 'images of deceased people' and even 'images of surgery'). Our society has very well developed methods of flagging content that some audiences may wish to avoid - right up to full TV, movie & print "classification" and censorship. Plus we have blanket bans, enshrined in legislation, on things like "kiddie porn" and "snuff movies".

Simple-minded banning of pages based on keywords or URL makes a priori judgements of what will and won't offend the audience - or under high-control regimes, what is or is not banned/seditious material. Then it becomes a simple "arms race" - two camps competing against one another (attack and defence), and by definition the reactive side can only respond once a new exploit/mechanism is noticed and identified. Yep, it's effective against people obeying the rules, but at the price of massive collateral damage and never being sure you're not compromised.

Generally, the USA is particularly sensitive to sexual matters, but not to violence. Sweden mostly has very different mores...
Filtering all pages that mention 'breast' or its (English language) derivatives and colloquialisms fails in many ways: it blocks legitimate medical & pregnancy content ('false positives'), is easily circumvented by mistyping, obfuscation or using images ('false negatives'), and is completely irrelevant for non-English language pages.

In the world of IT Security, this is why we now have Firewalls and Intrusion Detection Systems [and now systems that actively seek to confuse/entrap/counter attackers.] Funny - just like in the real world.

I'm thinking the web-server is the place to insert consistent meta-tags into content.
And that requires a minimum additional two publication stages - author, reviewer, editor/publisher - [as described by Peter Miller in his Aegis Documentation piece (82Kb PDF) Aegis Is Only For Software, Isn't It?].
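
A hypothetical sketch of what "inserting consistent meta-tags at the web-server" could look like - the tag names and rating vocabulary here are invented for illustration, not an existing standard:

    # Hypothetical illustration: stamp classification meta-tags into a page at
    # publication time. Tag names and vocabulary are invented for this sketch.
    def tag_content(html, rating, reviewer):
        meta = (f'<meta name="content-classification" content="{rating}">\n'
                f'<meta name="content-reviewer" content="{reviewer}">\n')
        if "<head>" in html:
            return html.replace("<head>", "<head>\n" + meta, 1)
        return meta + html  # degenerate case: page has no <head> element

    page = "<html><head><title>Example</title></head><body>...</body></html>"
    print(tag_content(page, rating="general", reviewer="editor@example.org"))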

Nothing publicly published should go untagged - and that needs independent review and an enforced publication process.
[OK, so where does that leave the wonderful world of 'blogs'?]

We live in interconnected communities, now global in Cyberspace. All of us have sensitivities that should be respected and the publishing world evolved over many centuries a tradition of "no surprises". It's a convention that has served us well before Cyberspace, it would serve us there as well or better - with everyone "just one click away" from your content.

Free Speech is only a Right in some countries.
Censorship is a given and necessity, even in the most "enlightened" countries - where it might be called 'national security' :-)
And there are globally shared mores/values/injunctions against such things as child pornography and worse.

It's not an even playing field, and will never, can never, be.

My opinion is that laws like the DMCA [USA - Digital Millennium Copyright Act] and the Australian "anti-spam and pornography" laws [no refs] are wrong-headed and irrelevant at best - and counter-productive at worst.

With the Global Net and One Shared Cyberspace, and many cultures, beliefs, religions, etc etc, "Web 2.0" needs to add:
mandatory content tagging.

Then we can abide by our tried-and-true convention "no surprises" and respect all our differences and sensitivities.

2007/04/03

Selling Good Governance - I.T. Services Audits

IBM got to be bigger, by turnover, than everyone else combined for nearly two decades, accounting for up to 60% of IT sales. One of the chief factors was that they were good salesmen - they knew their audience: who to target and what things they wanted (and they only sold to people who could sign the cheque!)

IBM didn't sell to "techos" - but managers, the more senior the better. They talked their language (cheaper, better, faster) and gave solid "Dollars and Cents" Costs and Benefits. They got to come back because they generally made good on those promises.

Selling I.T. Services Audits, Security and Continuity


These functions are Governance related and should be controlled by, and reported directly to, Board level - not even senior management or the CEO.

Board Pitch


Can your Business run without Accounting??
  • No!

Can it run without its I.T. services?
  • No!

What part of your business isn't affected by I.T.?
  • None!

Why do you have Accounting Audits?
  • "Have to" - regulatory requirement.
  • "credibility enhancer" - investors and owners can trust the figures claimed.
  • Integral to Good Governance. The things the Board wants done are being done.

Why don't you do I.T. Services, Security and Continuity Audits?
  • Ummmmm?


If you're entrusted with husbanding other people's money, not assuring and insuring the I.T. Services of the business isn't sound practice.

Major failures/events in any one of these functions are high impact: they are "Bet the whole company".
The sort of decision that the owners need to make, and make consciously.

Supporting Facts


From a Sarbanes Oxley site:
Fifty percent of companies that lose their data go out of business immediately and ninety percent don't survive more than two years, according to research firm Baroudi Bloor International. ...
Only three percent of all data loss is caused by fire, flood and other such disastrous events. The most common causes are hardware or system malfunction (44 percent), human error (32 percent), software corruption (14 percent) or viruses (7 percent). ...
And remember, without your business's data, there's no business at all.


In a brief report on a fire in a British Telecom hub in Manchester affecting 136,000 phone lines:
  • 86 percent of firms affected found the fire was disruptive and it had an impact on voice communications in 60 percent of those polled....

  • Just 34 percent had a disaster recovery or business continuity plan in place ....

  • Those polled showed low awareness of solutions, nor did most appreciate the need for business continuity planning. 71 percent saw little value in automatic call diverts in emergency situations and 70 percent of those polled were unaware that banks expect businesses applying for loans to have a proven disaster recovery plan in place.


In 10 Steps to surviving a disaster(PDF)
According to the Association of Records Managers and Administrators, about 60 percent of businesses that experience a major disaster such as a fire close within two years. According to Labor Department Statistics, over 40 percent of all companies that experience a disaster never reopen and more than 25 percent of those that do reopen close within two years.


And from Glen Abbot, Scotland’s leading supplier of Business Continuity Services.

Business Failure

A business failure is defined as:
"An occurrence, and/or perception, that threatens the operations, staff, shareholder value, stakeholders, brand, reputation, trust and/or strategic/business goals of an organisation."

In a five-year period, twenty percent of companies within the UK will suffer some kind of serious disruption to their operations. This may be as a result of an IT failure, emergencies such as fire or flood, or some other unplanned disruption. Eighty percent of those companies who suffer a serious disruption suffer severe losses or fail to survive in business during the following eighteen months (National Audit Office).


And yet more in the Reader Comments section of this piece on 'Continuity Central'.

2007/04/02

Three Metrics to change our business

In a previous post, Research Outline, 3 sets of metrics were proposed that, if applied consistently across large organisations, would change the face of our industry (IT&T), perhaps even support the transition to a Profession.

"IT is done for a Business Benefit"


After 50+ years of doing it, we are looking at the end of the Silicon Revolution by 2010. Already we've passed the end of Moore's Law for CPU speed [Q1-2003]. But more than that - Business & Government are getting hard-nosed about IT&T delivering 'value'.

The IT recession we're just coming out of was a direct reaction against the perceived needless waste of Y2K. The other in 1991 was the marker that all the 'easy wins' in IT had been achieved and IT itself could be cut.

Big Business and Government account for over 60% of the Australian GDP. Around 45% of GDP is influenced directly by IT&T - with an investment rate of around 10% - $45Bn/year for 'the majors'. Globally, multiply this by 50-60 times. [Source: ABS surveys]

Compare this to the ~$50Bn earnings by all companies listed on the ASX. Leveraging IT&T whilst containing costs is a central concern of all good business execs - and becoming more so. Shaving 1% off IT&T inputs goes directly to the bottom line and allows good companies to easily outperform their competitors.
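
The arithmetic behind that last sentence, using the approximate figures quoted above:

    # Approximate figures from the text above (AUD per year).
    itt_spend_majors = 45e9       # ~$45Bn/year IT&T spend by 'the majors'
    asx_total_earnings = 50e9     # ~$50Bn earnings by all ASX-listed companies

    saving = 0.01 * itt_spend_majors
    print(f"1% off IT&T inputs = ${saving / 1e6:,.0f}M straight to the bottom line")
    print(f"... about {saving / asx_total_earnings:.1%} of total ASX-listed earnings")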

My belief is that the first people to adequately address these questions in quantifiable terms will dominate the market. And what better way to charge than a percentage of the realised savings? For a consulting firm, that's putting its money where its mouth is...

Metrics


The three sets of figures I'd like to produce are linked to this central question:
Doing More with Less.

  • What's the leverage IT&T gives us? [Virtual Employees]
    • Year on Year reporting from a consistent base.
  • Where do our IT&T costs go? [Standard reporting in Business Inputs and Outputs]
    • Are we getting a good deal from our IT&T?
    • Comparing to what?
  • How effective are our IT&T processes? [Benchmarked KPI's]
    • If ITIL is the answer, how well are our folks doing it?
    • How much more room for improvement is there?


And the worst thing that could happen is:
You find out your IT&T people do a good job.

2007/03/23

Future Forecasting for I.T. - how close to 'mature' is the market?

Jonathan Schwartz of SUN Microsystems posted an article on the SUN and Intel Alliance. SUN may be coming back from the brink - with the 'opening' of Solaris, they could have realised again that they're a hardware company (and do great servers).

There was a line that gave me pause:
To be clear, this isn't about displacing one another's competitors, it's about getting as big a piece of the future as possible. The market's not shrinking, after all.


I was struck by The market's not shrinking, after all.

In 2000, the personal-use PC's were 'desktops' - now laptop sales are at least equal or higher...
The world is changing - the I.T. market is very close to maturation - near 'topping out' perhaps.

Take for instance the Gartner predictions for desktop/laptop sales in next 12 months (can't remember the link).
They forecast a 10.6% growth in sales volume (to 255+M units) but only 4.x% increase in sales dollars.
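
What those two growth rates imply for the average selling price (reading the "4.x%" as roughly 4%, which is my assumption):

    # Implied change in average selling price from the forecast growth rates.
    unit_growth = 0.106     # forecast growth in units shipped
    revenue_growth = 0.04   # forecast growth in sales dollars ("4.x%" read as ~4%)

    asp_change = (1 + revenue_growth) / (1 + unit_growth) - 1
    print(f"Implied average selling price change: {asp_change:.1%}")  # roughly -6%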

SUN have announced their "DataCentre in a Container". Think that through - these are effectively very nicely packaged *mainframes* of MIMD (non-homogeneous) design versus the classic MIMD (SMP) design. You get a 'volume discount' by buying excess capacity - and it comes prebuilt. Your techs should not ever be opening the doors. It really will be "everything in software". And the box could be anywhere within a few milliseconds down the network.

Some organisations will resell capacity - not like the old Processing Bureaus and lately web hosting, but fractional amounts of a 'box'. Just like leasing office, storage or warehouse space.

The big change will be corporations adopting the same scale-up/scale-out architectures as the large internet companies - the Internet Data Centre rather than the usual Enterprise Data Centre...

Moore's Law on CPU speed broke in Q1-2003 - but those pesky engineers are still building smaller devices and putting more transistors on a chip - that means more bang for your CPU buck for maybe another 10 years [definitely 2010, but why not 2015].

Scenario:
Organisation buys a DC box. Keeps it for its economic life (dominated probably by disk size/failures), then replaces it.
The new box *will* have more CPU power, or cost less per processing 'unit', modulo disk pricing.

All of a sudden servers (the things that SUN sells) will be bought in large quanta, kept and replaced in the same large quanta.
And each quantum will feature better "bang per buck". What we've seen in desktops and servers is that unit price can't be maintained - the price of the low-end units will keep drifting down.

The West's economy is getting close to being saturated with corporate compute power...
Real growth might occur in the developing world - that's a complex equation that includes social and cultural variables.

So will the market for server CPU's keep expanding? I think we are close to maturation of the I.T. industry, within 30-50% of the maximum CPU demand... which means very close to the maximum in total sales dollars.

Modulo brand new applications of course :-)
Artificial Intelligence, Knowledge Management or Data Mining/Business Intelligence could actually deliver something useful one day.

2007/03/22

I.T. in context

Here are Questions, not Answers...
Things that I'd like to explore and have better answers on.

Most of these questions probably don't have permanent 'answers' - each generation, each culture, each industry has to define and redefine them for their mix of technology, political structure and workplace organisation I suspect.

2007/03/20

Quantifying the Business Benefits of I.T. Operations

Objectives (The What)


That "I.T. is done for a Business Benefit" seems axiomatic.

But where's the evidence after 50-60 years of computing? It's not coming out our ears - just the reverse.

Businesses understand the importance of hard data and its thorough analysis for marketing, but don't apply the same techniques or management principles to their I.T. Operations.

I'd like to model and quantify the Business Benefits of I.T. Operations across multiple organisations to provide baselines, benchmarks and trend analysis. The impact of all aspects of I.T. is beyond the scope of a single researcher project.

Approach (The How)



Data is the fundamental input for analyses. Leveraging what's available means the outputs can be commercially reproduced and are within the project budget (zero cost).

Three separate data streams will be mined:
  • Historic "ITSM" tool data from multiple organisations.
  • Detailed I.T. accounting information from selected organisations.
  • Primary research in one organisation to collect and report "FTE equivalents provided" by I.T.


[FTE = Full Time Employee. Otherwise, "virtual employees". What head count and cost would be needed to provide similar services with 1965 technology.]
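
A minimal sketch of the "virtual employees" calculation - the transaction volume, 1965-style clerical throughput, salary and I.T. cost below are placeholder assumptions, purely to show the shape of the metric:

    # Placeholder assumptions only - the shape of the metric, not real figures.
    transactions_per_year = 10_000_000    # assumed volume handled by the I.T. systems
    manual_txns_per_clerk_year = 20_000   # assumed 1965-style clerical throughput
    clerk_cost_per_year = 60_000          # assumed fully-loaded salary, today's dollars
    it_operations_cost = 3_000_000        # assumed annual I.T. Operations spend

    fte_equivalents = transactions_per_year / manual_txns_per_clerk_year
    avoided_labour_cost = fte_equivalents * clerk_cost_per_year

    print(f"Virtual employees provided by I.T.: {fte_equivalents:,.0f} FTE")
    print(f"Avoided labour cost:  ${avoided_labour_cost / 1e6:,.1f}M/year")
    print(f"I.T. Operations cost: ${it_operations_cost / 1e6:,.1f}M/year")
    print(f"Leverage ratio: {avoided_labour_cost / it_operations_cost:,.0f} : 1")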

Importance/Value (The Why)



These propositions are to be tested:
  • I.T. is done for a Business Benefit.
  • Business Benefits, tangible or intangible, should be measurable.
  • Organisations these days are dependent on their I.T. Operations.
  • I.T. cuts across all segments of current organisations.
  • I.T. defines the business processes and hence productivity of the whole organisation.
  • What you don't measure you can't manage and improve.
  • Improving the effectiveness of I.T. Operations requires reliable metrics.
  • Common I.T. Reporting Standards, like the Accounting Standards, are necessary to contrast and compare the efficiency and effectiveness of I.T. Operations across different organisations or different units within a single organisation.


I.T. is a cognitive amplifier: it delivers "cheaper, better, faster, more, all-the-same", through the embedding of finely detailed business processes into electronic (computing) systems.

For simple, repetitive cognitive tasks, computers are 1-5,000 times cheaper than people in western countries.

From this amplification effect, computers still provide the greatest single point of leverage for organisations. They underpin the requirement to "do more with the same", improving productivity and increasing profitability.

Subtle shifts in this whole-organisation amplification ratio (e.g. from 100:1 to 95:1 or 105:1) are impossible for isolated individuals to detect unaided. But they make very large differences to the 'global' organisation output and productivity.

In retail businesses, the gross margin is often around 2.5%. Reducing whole-company productivity by 5% will destroy its profitability, and without any metrics, will be impossible for any management team to identify and resolve.
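
Worked through under an assumed revenue base (the $100M turnover is mine; the 2.5% margin and 5% productivity hit are the figures above):

    # The $100M turnover is an assumption; margin and productivity hit are from the text.
    revenue = 100e6           # assumed annual retail turnover
    gross_margin = 0.025      # ~2.5% margin, as above
    productivity_hit = 0.05   # 5% drop in whole-company productivity

    profit_before = revenue * gross_margin
    # Crude reading: a 5% productivity loss inflates the cost base needed to
    # deliver the same turnover by ~5%.
    cost_base = revenue * (1 - gross_margin)
    profit_after = revenue - cost_base * (1 + productivity_hit)

    print(f"Profit before: ${profit_before / 1e6:+.2f}M")
    print(f"Profit after:  ${profit_after / 1e6:+.2f}M")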

The few studies of "IT Efficiency" that are available show that IT effectiveness is highly variable and unrelated to expenditure.
My proposition is that "intuitive management" of IT is stretched well beyond its useful limits and needs to be replaced by evidence-based management.

The value-add to business is two-fold:
  • manage down the input costs of the I.T. infrastructure and,
  • quantify the "cognitive amplifier" effects across the whole organisation to make informed decisions on optimum 'global' investment/expenditure on I.T. Operations.


[There's the 1990 HBS or MIT study into "White Collar Productivity" - reporting a decrease in the first decade of PC's]

Previous Work (What else)


There is a dearth of published material/research in this area.
The "State-of-Practice" is "NEVER DONE".
There is much opinion in the area, without substantive evidence: e.g. Nick Carr and "Does IT Matter?"

"Commonsense" IT Practitioner approaches, ITIL and COBIT (others?), do not address the measuring and managing of I.T. outputs and their business effects, ultilisation and effectiveness.

The McKinsey report/book on European Manufacturers and their I.T. expenditure versus financial performance shows there is no correlation between effort (expenditure) and effect (financial performance).

Jerry Landsbaum's 1992 work included examples of his regular business reports - quantifiable and repeatable metrics of I.T. Operations phrased in business terms. This work seems entirely disregarded.


Hope to find (The Wherefore)


  • Model I.T. Operations performance within and across similar organisations.
  • Generate tools usable within organisations to collect/report their own metrics.
  • Define a set of useful I.T. Operations performance and Business Impact metrics.
  • Model inputs to Business and Business Utilisation/Outcomes.


Report Outline


  • Analyse ITSM tool data. Derive KPI's, Internal Baselines/Trends, Cross-section Benchmarks
  • Annual I.T. Operations Report
  • FTE Employee equivalents - Count and Cost
  • Why IT Matters to the Business.
  • Gaps in Service Management models - ITIL and COBIT
  • Adding I.T. Operations to Management Theories.
  • Advancing I.T. as a Profession
  • Further Work and Research Questions


Execution Phases