I joined the team at the DfE just before the global Covid-19 pandemic.
The role was simple enough: take charge of the technical team of developers and DevOps engineers, and keep the service running whilst enhancing it with new features.
It soon became apparent that the system had been massively neglected for a long time and was struggling to cope with the demands that the DfE were throwing at it.
The system needed to support 350+ concurrent users in normal times, but would start struggling, and failing for up to 10% of users, at around the 300-user mark.
Then Covid-19 hit, and the government’s response caused the user load to spike to 5,000 concurrent users.
So my role quickly shifted to investigating an already complex system and working out how we could solve the issues.
With a team of dedicated developers, I led the effort to find out what the shortfalls were, identify where in the ecosystem they existed (both evidenced and theoretical), and work out how best to fix them.
Within a very short space of time we were in a better place, able to handle the daily grind of the schools’ testing and reporting duties, and even looking to improve and extend the service’s features.
By the time the contract ended, the system was able to handle in excess of 10,000 concurrent users with no evidence of struggle. The true ceiling was never found, because the computing resources available were not enough to break it.
During all of this I also introduced and enforced new security and operational processes, reducing the overhead of the existing ones and allowing more to be released more quickly and safely.
There was also an opportunity to start revitalising the ageing user interface, which I undertook with one other person as a skunkworks team, moving parts of the system to the ReactJS framework.