Coping with high-stress system events

David Swayne, CIO at London South Bank University, outlines how visibility of all digital systems now lets him be proactive rather than reactive

An IS professional’s life would be easy (if a trifle dull) if system usage was the same every day. But of course it’s not. Throughout the year, universities and other HE institutions experience a series of ‘spikes’ of demand on their systems, madcap days when usage goes through the roof and the systems’ processing, network and storage capabilities are tested to the limit.

These include Clearing in August, especially the organised mayhem of the first day; enrolment day and the first day of term, when every student suddenly wants to look up their timetable; and results day, when everyone is anxious to see how they’ve done.

These are days of high profile as well as high demand. With Clearing accounting for up to a third of admissions to some institutions, a major outage at peak time could mean either failing to fill all the places, resulting in a significant shortfall in revenue, or awarding too many places and being fined for exceeding the government cap.

On our first day of Clearing, for example, we can receive 1,200 phone calls – a whole month’s worth – so we build a 50-seat call centre in one of our computer labs specifically for the event.

On days like these, when demand is at its highest, what we need most is visibility. We need to see how the systems and their components are performing, so we can pre-empt any issues with proactive management rather than reacting to them once they’ve occurred.

We had experienced a few issues with Clearing in previous years, which affected the service we were able to deliver to prospective students. So in 2013, we set up a pilot of netEvidence’s Highlight service in early August, a few weeks before A-level results day.

The real surprise was that it took only a few hours to set up Highlight. We could then see clearly how the various servers and network devices were coping, through measures such as CPU and memory usage, disk space, website traffic, network data volumes and user response times.

Then, on the first day of Clearing, I could see by early afternoon that disk usage was up to 95 or 96%. We could potentially have run out of space, so I was able to be proactive and allocate more resources before that happened. As a result, the system kept running at full capacity.
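For readers who want to picture the kind of check involved, here is a minimal sketch in Python of a disk-usage threshold test. It is not how Highlight works internally; the 90% warning level, the function names and the monitored path are all assumptions made purely for illustration.

```python
import shutil

# Hypothetical warning level; the article does not state the threshold used.
WARN_PERCENT = 90.0


def percent_used(total_bytes: int, used_bytes: int) -> float:
    """Return used space as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def needs_attention(total_bytes: int, used_bytes: int,
                    threshold: float = WARN_PERCENT) -> bool:
    """True when usage has reached or passed the warning threshold."""
    return percent_used(total_bytes, used_bytes) >= threshold


if __name__ == "__main__":
    # "/" is an assumed path; point this at whichever volume matters.
    usage = shutil.disk_usage("/")
    pct = percent_used(usage.total, usage.used)
    status = "WARNING" if needs_attention(usage.total, usage.used) else "OK"
    print(f"{status}: disk is {pct:.1f}% full")
```

Run on a schedule, a check like this would flag a volume at 95% well before it fills, which is the sort of early signal that leaves time to add capacity.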

With Highlight, we now have an end-to-end view of entire transactions, giving a joined-up picture of what real users are experiencing, combined with detailed information on individual system components and their actual or potential impact on the performance of the whole system. At times of peak demand such as Clearing or enrolment we now monitor Highlight particularly closely.

The IS team aren’t the only ones who use Highlight to monitor our systems’ health. Our senior management team also want to be assured that all is well. They can look at Highlight’s colour-coded Service Tiles to get a top-level view and make sure there are no issues with their specific digital service, while we concentrate on the detail.

When systems (and people) are stressed, hard facts are very welcome, to the IS team and management alike.

* To learn more about how to cope with Clearing and other high-demand events, as well as more of LSBU’s experiences, download netEvidence’s report Are You Ready For Clearing? at