2007/06/07

Why IPTV won't work in Australia - ISP Volume Charging.

It's all about cost...

A phone line costs $25-$35 a month just to have. Add the cost of an ADSL plan on top.

ADSL plan costs vary with speed and pre-paid download volume.
Reasonable-quality TV needs around 4 Mbps these days - which means the highest-speed, highest-cost plans.
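
A back-of-the-envelope sketch of the volume problem (the viewing hours and the quota comparison are my illustrative assumptions, not plan data):

    # Rough monthly download volume for IPTV at a 4 Mbps bitrate.
    bitrate_mbps = 4.0        # "reasonable TV" bitrate from above
    hours_per_day = 4         # assumed household viewing time
    days_per_month = 30

    gb_per_hour = bitrate_mbps / 8 * 3600 / 1000   # Mbps -> MB/s -> GB/hour
    monthly_gb = gb_per_hour * hours_per_day * days_per_month
    print(f"{gb_per_hour:.1f} GB/hour -> {monthly_gb:.0f} GB/month")
    # ~1.8 GB/hour, ~216 GB/month - several times the pre-paid volume
    # of even generous 2007 plans, so excess-usage charges dominate.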


2007/06/03

Turnarounds

My previous post on 'Digging Out', a methodology built from experience in a number of turnarounds, can't stand alone without some justification:
What have I done to be credible?

Here's a sample taken from a version of my CV:
Completing large business critical projects on-time and on-spec. In complex political environments achieving all client outcomes without formal authority.
  • ABN online registration system - Business Entry Point, DEWRSB.
  • Y2K conversion - Goodman Fielders (Milling and Baking)
  • Y2K remediation, CDM
  • DFAT (ADCNET Release 3, ADCNET build/release system)
  • TNT
  • Unisys/Customs (EDI & Finest/Nomad)
  • CSIRO - all Australian daily weather database
  • Diskrom ($350k project income in a year)
ABN registrations:
The ATO paid for and ran the project. The software contractor was combative and unhelpful. The environment was complex - around 6 different organisations for this one project, and another 10 projects besides. Getting anything done required a huge amount of effort, negotiation and discussion.

The software contractor hadn't built any load monitoring and response time facilities into the system nor made any provision for capacity planning and performance analysis.

On my own initiative, I designed and built some simple but sufficient tools and did the performance analysis - accurately predicting the final peak load of 20 times the design load - and, after diagnosing a catastrophic failure mode, designed a simple solution.
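
For flavour, a minimal sketch of the sort of tool that sufficed - a response-time summary from a timestamped log (the log format here is a hypothetical stand-in, not the original):

    # Summarise hourly load and 95th-percentile response time from a log
    # of "ISO-timestamp elapsed-milliseconds" lines (hypothetical format).
    import sys
    from collections import defaultdict

    by_hour = defaultdict(list)
    for line in sys.stdin:
        timestamp, elapsed_ms = line.split()
        by_hour[timestamp[:13]].append(float(elapsed_ms))  # bucket by hour

    for hour in sorted(by_hour):
        times = sorted(by_hour[hour])
        p95 = times[int(0.95 * (len(times) - 1))]
        print(f"{hour}  requests={len(times):6d}  p95={p95:8.1f} ms")
    # The request counts give the load curve for capacity planning;
    # the p95 response time shows where the system starts to saturate.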

The site's 100% availability over 3 months was not accidental. It directly contributed to 600,000 of 3.3M registrations being done on-line (around 15 times the estimate), and to the site not attracting bad press like other aspects of the operation. Definitely a high-profile site.

The software contractor had to be dragged kicking and screaming all the way through the process. But I got the outcome the client needed - the site kept running with good response time on the busiest day. Some years later I analysed the performance data from the time and uncovered a few more nascent problems that we'd skirted around.

Goodman Fielders (Milling and Baking)
This was a Y2K project - they needed to migrate a legacy (Universe/Pick) application to a Y2K-compliant platform and simultaneously upgrade the software and re-merge all 6 variants.

The application ran their business - ordering, tracking, accounting - the whole shooting match.
And for accounting systems, the Y2K deadline was the start of the financial year - 1-July-1999.

The work got done. I contributed two major items: deferring the non-accounting conversion, and moving the new single system to a properly managed facility.


DFAT (ADCNET Release 3, UNCLgate, ADCNET build/release system)
This was bizarre: I ended up sitting 20 feet from the desk I'd used when I worked on the system being replaced ('the IBM'), back when it was being commissioned.

ADCNET failed and went to trial with the developer losing on all counts. It's worth reading the decision on 'austlii.edu.au' [Federal Court]. That account is certainly more detailed than the ANAO report. It was obvious in 1995 that the project could never deliver, let alone by the deadline. So I did my tasks and went my way.

I was called back again to debug and test an email gateway between the IBM and ADCNET (R2) for Unclassified messages (UNCLgate). This was the first time I realised that being better than the incumbent staff in their own disciplines was 'a career limiting move'. Showing experienced, supposedly expert, programmers how to read basic Unix 'man' pages and act on them was a real lesson. My testing found and fixed a major problem that caused queued messages to be discarded - along with solving a bunch of the usual monitoring, administration and performance issues.

I was called back yet again to help with the Y2K conversion of ADCNET (departmental staff were doing it). The system was over a million lines of code, the release/development environment was bespoke, and the maintenance work required on the dependencies/make side of the software had never been done. A few months' part-time work saw all that tamed.

TNT
I went for a year as an admin. Did what I could, but they were past redemption... They were bought out by the Dutch Post Office (KPN) soon after I'd arrived.
Presented a paper at a SAGE-AU conference detailing my experience with their 'technical guru' - who'd considered himself "World's Greatest Sys Admin". Google will find the paper for you.
It was so woeful, it defies description.

Unisys/Customs (EDI & Finest/Nomad)
In early 1995 I was called in to replace a SysAdmin for 8 weeks on the Customs "EDI Gateway" project. The project and site were a disaster - so much so that Unisys listed it as a "World Wide Alert" - the step before they lost the customer and hit the law courts.

In two months the team stabilised the systems, going from multiple kernel 'panics' [a very bad thing] per week, 8-10 hour delays in the busy hour and lost messages - to 100% uptime over 6 systems, 1-2 second turnarounds, and reasonable documentation, change processes and monitoring/diagnosis tools. The Unisys managers were very appreciative of my efforts and contributions. The same sort of chaos was evident in the 2005 Customs cargo clearance system debacle. [The 'COMPILE' system ran on Unisys 2200 and was being replaced over a 10-year period. It was the back-end for the EDI systems I worked on.]

So much so that I was called back for another few months to stabilise another system, running ADABAS/Natural legacy applications that provided the Financial Management & Information Systems and the Payroll/Personnel system. Another high-profile, critical system.

CSIRO - all Australian daily weather database
The research scientists on the project I worked for had created some tools to analyse weather data - and had found a commercial partner to sell them. The partner was not entirely happy, due to extended delays and many unkept promises. I'd been told that buying the entire dataset - a Very Good Thing for the commercial partner - was not affordable: around $20,000 for the 100 datasets from the Bureau of Meteorology. When I contacted the BoM, they not only provided the index in digital form for free, but quoted around $1,750 for the whole daily dataset. I scammed access to another machine with the right tape drive, wrote scripts and did magic - and stayed up overnight reading the 70-odd tapes. In pre-Linux days, there was no easy way to compress the data and move it around.

The whole dataset as supplied was 10GB raw - and I only had a 1GB drive on my server [$3,000 for the drive!].

It took 6 weeks to fully process the data into their file format. And of course I had to invent a rational file scheme, and later wrote a program specifically to scan and select datasets from the collection.
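
In today's terms, that scan/select program was a simple index filter; a minimal sketch (the index layout and field names are invented - the original format is long gone):

    # Select station datasets from a plain-text index of
    # "station_id state first_year last_year" lines (invented layout).
    def select_stations(index_path, state=None, min_years=0):
        with open(index_path) as index:
            for line in index:
                station, st, first, last = line.split()
                if state and st != state:
                    continue
                if int(last) - int(first) + 1 < min_years:
                    continue
                yield station

    # e.g. all NSW stations with at least 50 years of daily records:
    for station in select_stations("daily_index.txt", state="NSW", min_years=50):
        print(station)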

The Commercial Partner got to release the product at the ABARE 'Outlook' conference in a blaze of CSIRO publicity. I don't know what the sales were - but they were many times better than before.
The research scientist got a major promotion directly as a result, and I was forced to leave for having made it all possible.

Diskrom ($350k project income in a year)
In under a year I learnt SGML, hypertext and enough about handling large text databases [1991 - before the Web had arrived], took over and completed 3-4 stalled and failing projects, migrated datasets and systems, designed tools and file structures/naming conventions, completed the first merge of the Income Tax Assessment Act with the Butterworths commentary, and sped up a critical processing step by 2,000 times - all of which directly contributed $350,000 in revenue [apportioned to my effort], or around 12 times my salary.

So, naturally, everybody else in the office was given a pay rise; I was told that I was technically brilliant but not worthy of one, and one of the 'political players' was promoted to manage the group. Along with a number of other key technical 'resources', I left to pursue other avenues.

Diskrom was shut down just a few years later, when a new chief of the AGPS (Aus. Gov. Printing Service) reviewed the contract and decided they were being ripped off. AGPS had provided all the infrastructure and services, with the commercial partner paying for staff and computers - and despite lucrative contracts and overseas work, had never seen any return.

Basis of Quality - What I learnt in my first years at work

On the 17th of January 1972, with 70+ others, I started as a cadet at CSR. This early work experience set the stage for how I approached the rest of my life at work. I've come to the conclusion that these early formative experiences are crucial to a person's performance and attitude for their entire working life. Breaking these early patterns can be impossible for some.

The cadets of my intake had the privilege of unrivalled experiential life lessons in quality, safety, team building & working, skills development and personal responsibility.

The lessons I took with me into my career:
  • Quality workmanship and precision techniques
  • Formal quality techniques and analyses
  • Following precise analytical processes
  • Managing high work loads and meeting tight deadlines
  • Responsibility for high-value processes/outcomes
  • Satisfying stringent customer requirements
  • Respect for, coping with and managing work in dangerous
    environments - and experience in handling/responding to hazardous
    incidents.
  • And doing time sheets and "filling in the paper work" as a natural part of things.

My first 2 years of full-time work as a cadet chemical engineer at CSR Ltd saw me personally responsible every day for $10Ms of product – performing sugar analysis for the whole of the Australian sugar crop. At the time the price of sugar was at an all-time high. Each of us played our part in the process - no one person did it all, but one person could botch the work of a whole team, or even a whole day's work.

At these NATA certified laboratories, we were trained in Chemical Analysis - but also safety and quality methods - with lives and ‘real money’ at stake.

Routinely large quantities of flammable and explosive alcohols, and highly toxic and corrosive acids were used and safely disposed of.

Deadlines were tight and fixed – each day, samples representing 50-100,000 tonnes of raw sugar had to be fully analysed and the results delivered, with a very high degree of accuracy and certainty, that same day.

Speed, precision and absolute dependability were instilled in us alongside a clear knowledge of the value and consequences of our work - and mistakes.

We were tutored in analytical techniques and trained in reading and exactly following processes, statistical analysis, fire and safety (OH&S) skills and certified first-aid, and in our duties and responsibilities to our clients - the sugar producers.

It was expected that "people make mistakes" - the first rule of any precise analysis is the error range (+/- ??). The system was consistently designed to produce accurate, repeatable results with a very low margin of error. Calibrated samples were fed through the process alongside multiple samples and any 'repeats' that had failed the checking process. The performance of individual analysts and groups was tracked and monitored. People were assigned to tasks suited to their particular talents - based on objective results, not presumption or bias.
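
As a toy illustration of the duplicate-and-repeat idea (the tolerance figure is invented - real methods state their own +/- range):

    # Duplicate-analysis check: two results for the same sample must
    # agree within the method's error range, or the sample is re-run.
    TOLERANCE = 0.05   # invented figure for the example

    def needs_repeat(result_a: float, result_b: float) -> bool:
        return abs(result_a - result_b) > TOLERANCE

    duplicates = {"S-101": (98.42, 98.44), "S-102": (98.10, 98.31)}
    for sample_id, (a, b) in duplicates.items():
        status = "REPEAT" if needs_repeat(a, b) else "ok"
        print(f"{sample_id}: {a} / {b} -> {status}")
    # S-102 disagrees by 0.21 and goes back through the process -
    # and the disagreement counts against the analysts' tracked record.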

We all acquired robust team working skills. Along with how to do the boring things like time sheets with exactitude.

The next year I spent working as an Analyst in the Pyrmont Sugar Refinery.

Lots of routine and tight deadlines - and the same attention to detail, importance of results and a necessity to understand 'the big picture'.

There'd been an analyst, an ex-secretary, who'd dropped her (mercury) thermometer into the 'product stream'. She hadn't realised it was a problem, and a few hundred tons of sugar had to be thrown away and the process shut down for cleaning. A tad expensive - many times her wage, and many times the cost of the training and supervision that would have prevented it.

My routine work led to uncovering a systematic problem with a very simple and preventable cause. We had a power station on-site - a real live coal burning, electricity and high-pressure steam (50 atmosphere?) producing plant. It used enough coal that when a fire in the furnace hopper caused a malfunction, the whole of the Sydney CBD was covered in a thick pall of black smoke - which made the TV news.

The plant feed-water was supposed to be closely maintained near a neutral pH - not acidic. A test station sat on the factory floor, with a reagent bottle next to it. The reagent ran out, so I made a new batch - only to find that the measurements were now grossly wrong. The bottle had been stored without a lid, and the reagent had slowly evaporated and concentrated. Over a couple of years the pH measurement had slowly drifted, and the water had been pushed way out of spec.

Serendipitously a major explosion or equipment failure was avoided. Replacement of the power-station, and shutting down the whole of the Pyrmont facility dependent on it for a couple of years, would've seriously impacted the whole company.

Digging Out - Turning around challenged Technical Projects/Environments

Something I wrote in 2002:

‘Digging Out’ - 7 Steps to regaining control

This is a process to regain administrative control of a set of systems. It can be practised alone or by groups and does not require explicit management approval, although that will help.

‘Entropy’ is the constant enemy of good systems administration – if it has blown out of control, steps must be taken to address it and regain control. The nature of systems administration is that there is always more that can be done, so deciding what not to do - where to stop - becomes critical in managing work loads. The approach is to ‘work smarter, not harder’. Administrators must have sufficient research, thinking & analysis time to achieve this – about 20% ‘free time’ is a good target.

This process is based on good troubleshooting technique, the project management method (plan, schedule, control) and the quality cycle (measure, analyse, act, review).

The big difference from normal deadline-based project management is the focus on tasks, not time. Tasks will take whatever time can be spared from the usual run of crises and ‘urgent’ requests until the entropy is under (enough) control.

Recognition

Do you have a problem? Are you unable to complete your administration tasks to your satisfaction within a reasonable work week? Most importantly, do you feel increasing pressure to perform, ‘stressed’?

Gather

The Quality Cycle's first step is ‘Measure’. First you have to consciously capture all the things that 1) you would like to do to make your life easier, and 2) take up good chunks of your time.

The important thing is to recognise and capture real data. As the foundation, this step requires consistent, focussed attention and discipline.

The method of data capture is unimportant. Whatever works for the individual and fits naturally in their work cycle – it must NOT take significant extra time or effort.

Analyse

Group, Rewrite, Prioritise.

Create a ‘hard’ list of specific tasks that can be implemented as self-managed mini-projects. Individual tasks must be achievable in reasonable time – say 1-2 days' effort. Remember, you are already overloaded and less than fully productive from accumulated over-stress.

Order the list by 1) business impact and 2) Daily Work-time gained.
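
As a sketch, that ordering is just a two-key sort (the tasks and scores below are invented for illustration):

    # Order the task list by business impact, then daily work-time gained.
    tasks = [
        # (task, business_impact 1-5, minutes gained per day) - invented
        ("automate log rotation",  2, 20),
        ("fix failing backups",    5, 10),
        ("script user creation",   3, 30),
    ]
    tasks.sort(key=lambda t: (t[1], t[2]), reverse=True)
    for name, impact, gained in tasks:
        print(f"impact={impact}  +{gained} min/day  {name}")
    # "fix failing backups" leads on impact; ties fall to time gained.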

The initial priority is to gain some ‘freeboard’ – time to plan, organise and anticipate, not just react.

Prioritisation can be done alone if there is no explicit management interest.

It will surprise you what management are prepared to let slide – this can save you considerable time and angst.

Act

Having chosen your first target, create time to achieve it. This requires discipline and focus. Every day you will have to purposefully make time to progress your goal. This means, for a short period, spending more time working or postponing less urgent work.

Do not choose large single projects initially, break them into small sub projects.

When you start, schedule both regular reviews and a ‘drop-dead’ review meeting – a time by which, if you haven’t made appreciable progress, you stop and re-plan the task.

Review

How did it go? Did you achieve what you wanted? Importantly, have you uncovered additional tasks? Are some tasks you’ve identified not actually necessary?

If your managers are involved, regular meetings to summarise and report on progress and obstacles will keep both you and them focussed and motivated.

‘Lightweight’, low time-impact processes are the watchword here. You are trying to regain ‘freeboard’, you do NOT need additional millstones dragging you further into the quagmire.

Iterate

Choose what to do next. If you’ve identified extra or unnecessary work items, re-analyse.

When do you stop this emergency management mode? When you’ve gained enough freeboard to work effectively.

A short time after the systems are back in control and you are working (close to) normal hours, you should consider scheduling a break. You’ve been overworking for some time and have lost motivation and effectiveness. A break should help you freshen up, gain some perspective and generate ideas for what to do next.

Maintain

What are you and your managers going to do to keep on top of things? How did you slide into the ‘tar pit’ in the first place? What measures or indicators are available to warn you if this repeats?

How will you prevent continuous overload from recurring?

2007/06/02

Commercial Software is Good - because you Have Someone To Sue.

A friend came back from an ITIL Practitioner's course with an interesting story:
  • The course was mostly about the complexities of handling Commercial Licenses, no two of which are the same.
  • The course provider made the not unexpected statement about Open Source:
    "Don't use it because you have nobody to sue".
    And they went on to ask "Why use OSS?"
His response was: "Because it's best of breed, especially rock-solid utilities & tools".
And they continued to not listen...

This note is NOT about that particular mindset [Risk Avoidance, not Risk Management].

I'd like to give him, and other technical people like him, a "slam dunk" one-liner response to each of these questions:
  • Why OSS - because it's best of breed!
    Why use a bug-ridden, poor functioning piece of commercial software when the best there is is rock-solid, secure & Free and Open?
    Not only do you remove the need to sue *anybody*, you get the best tool for the job and know it will never be orphaned, withdrawn or torpedoed.

    Or you may be held to ransom with enormous support costs - the Computer Associates model of buying 'mature' software and raising the support costs to turn a profit until the customer base bails out.
  • Using rock-solid OSS apps. means you are unlikely to need to sue anybody. It "just works", not "works just".
    And if you have concerns over "prudent commercial Risk Management", just hire an OSS support organisation that has both "Professional Indemnity" and OSRM insurance.

And I tried to quickly find *two* significant lists for him:
  • widely used open-source software [Apache, Samba, Perl-PHP-Python-Ruby, gcc/make/cvs/Subversion, Eclipse, ...]

    The caveat on this list is that I need estimates of the market share or extent of use of the software. Viz: for Apache, the Netcraft web-server survey.

  • OSS support organisations. [remember Linuxcare?]
If you have pointers to that data, I'd love to hear from you.

Who can we sue? Or - the Myth of Riskless I.T. Management

This started as a conversation on an Open Source list - on how to respond to people who assert:

"We can't use Open Source because there's Nobody to Sue".

2007/06/01

Why "IT Service Delivery and Management" should be an Academic Discipline

IT Service Delivery is where "the rubber hits the road". Without sufficiently capable and reliable IT infrastructure, every other piece of the IT equation - Architecture and Design, Business Analysis, Project Management, Software Engineering, Software Maintenance, Information Management, Data Modelling, ... - becomes irrelevant. All other effort is wasted if the services don't run for the users.

All the potential benefits of IT, the 'cognitive amplifier' effects and leveraging people's skills and experience, rely on the delivery of IT Services.

Where is the academic discipline that:

  • Defines IT Service Delivery (and its OAM components - Operations, Administration and Maintenance)
  • Provides a framework to compare and audit the performance of IT Service Delivery in an organisation against benchmarks relevant to its industry.
  • Defines absolute and relative performance of individuals, teams and IT organisations.
  • Defines and explores performance, effectiveness, utilisation and 'service benefit conversion' metrics?

If the goal of IT/IS is to "deliver a business benefit" - but the benefits aren't trivially measurable - then the deep knowledge and experience of the discipline of Marketing can be brought to bear. The first step in every project is to define 'the desired benefit', how it will be measured and reported, and the break-even or cancel point.

What's missing is the academic discipline that informs practitioners, management and the profession on how to actually realise the benefits of IT/IS systems in practice.

ITIL and ISO 20000

"ITIL" (IT Infrastructure Library) was first formulated in 1989 by the UK OGC (Office of Government Commerce) to provide a common language and framework, or conceptual model, for IT Operations (now 'service management') - necessary for the process of tendering and outsourcing.

In 1999 it was re-released ('version 2') and new books were written.

Mid-2007 sees the release of 'version 3' - another rethink and rewrite.

In 2000 ITIL spawned a British Standard, BS 15000, revised and updated in 2002. In Dec 2005 BS 15000 was adopted internationally as ISO/IEC 20000. It sits alongside "Information Security Management" ISO 17799 [formerly BS 7799:2002] and "Information Security" ISO 27001. BS 25999 addresses "Business Continuity".

Forrester Research reported in early 2007 (in "CIOs: Reduce Cost By Scoring Applications" by Phil Murphy) that 'IT Service Delivery' (Forrester calls it "lights on" operations and maintenance) accounts for a rising percentage of IT budgets. Referenced in "Maintenance Pointers".

Reasons for a discipline of IT Service Delivery and Management

  • The Forrester survey of October 2006 reports IT Service Delivery consumes 80% or more of IT budgets - up from 60-65% ten years ago.
  • 100% of the User Utilisation of IT Software, Systems and Services is mediated by Service Delivery. It's where "the rubber hits the road".
  • IT is useful in business because it's a cognitive amplifier - it amplifies the amount of useful work that people can perform/process. IT provides "cheaper, better, faster, more, consistent/correct".
  • Business and Government are now dependent on their IT. We've crossed the event horizon where [in the 'developed' world] it's no longer possible to resile from IT systems.
  • IT is arguably still the greatest single point of leverage [staff effectiveness amplifier] available to organisations.
  • Service Delivery is the anchor point for "What Value does IT Deliver?"

Where the 'IT Services' discipline belongs

There are two requirements for a faculty teaching 'IT Services' or 'IT Service Management':

  • Business and Management focus, and
  • Ready access to large, complex "IT Services" installations
Traditional computing and IT faculties are focussed on the internal technical aspects of computing. 'IT Services and Management' is about delivering and realising Business Benefits - a managerial focus. The necessary disciplines and knowledge/expertise already exist in Business/Commerce/Management Schools - and are somewhat foreign to traditional CS/ISE/IT Schools.

Canberra, accounting for 20% of the IT expenditure in Australia, is well placed to initiate 'IT Service Delivery and Management' in this country.