For a long time, many companies have used process improvement strategies (ideas such as Six Sigma, Lean, Kaizen, Re-engineering, and a slew of others) to streamline operational processes, and they have done so with great success. Process improvement and continuous improvement initiatives, which often rely on the help of internal audit and are based on a healthy dose of metrics, have succeeded in cutting waste, speeding production times, creating efficiency, simplifying processes, lowering costs, improving safety, and providing many other benefits.

Yet many companies have not been as successful at applying these process improvement strategies to technology-based processes. That’s because they often either see those processes as too complex to understand well, or they chalk up inefficiency as the price of security. Managers don’t have a good idea of what is happening in those complex lines of code or inside the “black box,” so they don’t bother to try to improve the processes they run. In manufacturing, operational waste and production problems are easier to see, identify, and remedy than in a technology process, where so much is happening inside a network or piece of complex code and work is not easily observed.

It doesn’t have to be that way. In fact, there is a great deal of the “human element” in technology processes that can be improved. There is also a lot of waste and inefficiency that gets built in along the way. There is no reason we can’t approach technology-based processes in the same manner and with the same techniques and rigour we use to improve operational processes. Internal audit can be a great catalyst to spark such an effort, aid its progress, and add value. We also need to rethink the way we use metrics to assess technology-based processes.

Applying operational process thinking to technology-based systems can have a wondrous effect. We can streamline technology processes to make them simpler, faster, cheaper, more efficient, and, yes, actually more secure.

Quality and Cybersecurity

What is quality in a cybersecurity process? Is it simply checking to see that various systems have controls in place? Perhaps, but isn’t there more to quality than just that? In the fast-paced cybersecurity world, risk is prevalent not only in securing various technology systems but in operations as well.

By taking an outsider’s view of a transaction (email message, online sale, electronic ledger input, etc.) and following it into and through the process, breaking it down into minuscule steps, an auditor (IT auditor or internal auditor) can uncover the risks hiding in plain sight. An auditor doesn’t have to be an expert in the process to identify the holes in it. By simply asking what happens at each step and thinking about what could go wrong within it, the auditor, working with the process owner, can uncover a tremendous number of operational problems that easily put the organization at risk. And who’s to say that the ideas of Lean, where any activity or step that uses resources but doesn’t add value is eliminated, can’t also be applied here?

Bad things, such as the wrong information posted to the wrong account, inadequate access controls, poor password habits, or a breach that occurs because an employee falls victim to an obvious phishing scam, are suddenly brought into focus. These are the fundamental problems that a cybersecurity organization faces each day and that go largely unnoticed. Alerting management to those risks is absolutely critical. What’s more, many of the problems are human-related, such as poor employee habits or inadequate training, rather than faulty code or improperly configured machines.

Going Through the Motions

This experience is hardly the norm. A report issued by Deloitte last year, the 2018 Global Chief Audit Executive Research Survey, found that “only 33 percent of chief audit executives (CAE’s) believe that their internal audit function is seen in a positive light.” This comes as no surprise to me, based on my experiences in various companies, and will be a familiar notion for many internal audit leaders. From my experience, getting called out during an audit can involve no small amount of shame for the manager, since the implication is that they overlooked or ignored something. That is immediately followed by the sinking feeling of having to resolve the findings within a specified time frame, with a team that is already stretched thin just trying to do their jobs each day.

The result of this typical situation is remediation solutions that are not thought through completely, but are only done to satisfy the audit report. In other words, the findings are remediated for the audit report, but the business now suffers from having to take more steps in their work to complete a transaction. When asked why the process is so slow, no one can really say why except that the “busy work” they must now do to satisfy the internal auditors is gumming up the works. Even worse, that complaint is often followed by the refrain: “They just don’t get it!” When it comes to technology-based processes, these problems are only magnified.

Cybersecurity and Operational Challenges

In the world of cybersecurity and tech-based transactional processes in general, the approach to improving operations is decidedly different from that in manufacturing. In manufacturing, operational waste and production problems are easier to see and remedy than in a technology process, where so much is happening inside the hardware or software code. It’s also easier to talk to employees on the line and ask them where they see problems or potential for waste, inefficiency, and mistakes. As with manufacturing, though, the focus is on the flow of a transaction from front to back, looking for areas that slow the process down or are confusing or challenging to work through. Where there are operational challenges, there is almost always a greater chance of something going wrong.

In technology we tend to accept more wrinkles in the systems. Moving beyond accepting inefficiency requires a change in mind-set. So let’s look at how we can borrow some of these process improvement and continuous improvement strategies from traditional operational processes and apply them to tech-based processes.

Move to Improve

Process improvement can help address problems with identified risks and implementation of controls, but it can do more than just that. It is important to remember that process improvement is not about the implementation of tools, but rather about the holistic, systems-thinking approach introduced by W. Edwards Deming and Walter A. Shewhart. This approach gets us to why we have operational gaps in the first place and leads us down the path to operational improvement and risk reduction. In my opinion, risk is introduced in a variety of ways by managers who take a siloed approach to managing their processes. Optimizing their own operations with technology and business rules designed to make their functions operate more smoothly and efficiently does not always work out as intended for customers, both internal and external.

If the goal is to reduce risk in its many forms in an operation, the quickest way to fail is to start trying to improve processes right away. A more successful approach is to focus first on the people and data that ultimately lead to lasting success. The following approach has been proven to work in cybersecurity environments:

  1. Understand what operational waste and risk look like
  2. Map processes to help improve understanding of what those processes really look like
  3. Design and implement a measurement system to reflect what happens in each process and highlight barriers to efficiency
  4. Rework the process to overcome inefficiencies, remove wasteful steps, and improve operations where needed through continuous improvement of people
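
As a rough illustration of steps 2 and 3 above, a process map can be written down as a list of steps, each tagged with its elapsed time and whether it adds value. The sketch below computes process cycle efficiency, a common Lean measure (value-added time divided by total elapsed time), and lists the non-value-adding steps by size. The step names, timings, and the `Step` structure are invented for illustration and are not taken from any real operation; the value-add judgment itself would come from the process owner.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    minutes: float     # elapsed time spent in this step
    adds_value: bool   # a judgment made with the process owner, not computed


def process_cycle_efficiency(steps):
    """Share of total elapsed time that is spent in value-adding steps."""
    total = sum(s.minutes for s in steps)
    if total == 0:
        return 0.0
    value_added = sum(s.minutes for s in steps if s.adds_value)
    return value_added / total


def waste_candidates(steps):
    """Non-value-adding steps, longest first: the first places to look."""
    return sorted(
        (s for s in steps if not s.adds_value),
        key=lambda s: s.minutes,
        reverse=True,
    )


# Hypothetical alert-handling process; all names and timings are invented.
alert_process = [
    Step("Alert received", 1, True),
    Step("Wait in shared inbox", 30, False),
    Step("Triage and classify", 10, True),
    Step("Re-key details into ticket", 9, False),
    Step("Contain or escalate", 10, True),
]

efficiency = process_cycle_efficiency(alert_process)
candidates = [s.name for s in waste_candidates(alert_process)]
```

In this made-up example only 21 of the 60 elapsed minutes add value, which is the kind of figure that gives step 4 a concrete target.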

Never Ending Story

Process improvement is important, but it can never be a one-time exercise. Internal audit and continuous improvement (CI) teams are typically aligned from the perspective of reducing risk and improving operations on an ongoing basis. From a CI perspective, having internal audit identify various risks helps open the door for the CI team to begin working in an area that is normally resistant to change. Compliance becomes more manageable, for example, as operational risks and controls are dealt with appropriately. Internal audit can recommend that the CI team engage in improvement activities and report back any issues to ensure compliance. This gets operational experts closer to the problems quickly and helps eliminate the findings and observations that have been recorded. Additionally, this approach brings a different perspective to the senior leadership team as they get a chance to see not only operational risk mitigated, but also operational value unlocked. Of course, this relationship isn’t limited to cybersecurity operations; it applies across the entire enterprise.

Technology processes are especially vulnerable to the “set it and forget it” mentality. How often do we revisit technology-based processes to ensure that they are organized for optimal performance given the current risk environment, or to ensure that changing business environments don’t demand tweaks to a given process? We should be continuously monitoring for outdated software; duplicate systems that achieve the same goals; opportunities for streamlining by, for example, standardizing on one top-performing vendor; the emergence of new, more efficient systems and apps; and other improvements. Indeed, technology-based processes simply must be subjected to continuous improvement techniques, particularly since the shelf life of nearly all technology components and elements is shrinking all the time.
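
Two of the checks mentioned above, outdated software and duplicate systems, lend themselves to a simple recurring review of a system inventory. The following is a minimal sketch, assuming the inventory can be exported as rows of system name, business function, and end-of-support date; the inventory contents and field layout here are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical inventory rows: (system name, function served, end of support).
INVENTORY = [
    ("LogTool A", "log collection", date(2024, 6, 30)),
    ("LogTool B", "log collection", date(2027, 1, 31)),
    ("MailGate", "email filtering", date(2026, 12, 31)),
]


def past_end_of_support(inventory, today):
    """Systems whose vendor support has already ended as of `today`."""
    return [name for name, _, end_of_support in inventory
            if end_of_support < today]


def duplicate_functions(inventory):
    """Functions served by more than one system: candidates to consolidate."""
    by_function = defaultdict(list)
    for name, function, _ in inventory:
        by_function[function].append(name)
    return {f: names for f, names in by_function.items() if len(names) > 1}


review_date = date(2025, 1, 1)  # fixed date so the example is repeatable
outdated = past_end_of_support(INVENTORY, review_date)
duplicates = duplicate_functions(INVENTORY)
```

A duplicate is not automatically waste; the point of the check is only to prompt the question of whether the redundancy has a sound case behind it.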

Simplicity Synchronicity

Even with technology processes, this quote from management guru Tony Robbins holds true: “The enemy of execution is complexity.” When we apply end-to-end process thinking to a security or network operations center, with an eye toward simplifying processes, we find operational issues in areas such as siloed technology and processes.

Another possible area for improvement and risk reduction is the peer-review process. In high-volume operations such as NOCs (network operations centers) or SOCs (security operations centers), a 100 percent code review process is no guarantee of catching all mistakes. Code can be posted to an incorrect account as easily as emails containing sensitive information can be sent to the wrong customer, even with multiple pairs of eyes reviewing it. The danger is not the technology, but humans interacting with technology in a very complex process.

Sure, some tech-based processes and operations, including cybersecurity, are complex and no one is suggesting oversimplifying them. But they don’t have to be needlessly complex. In fact, built-in complexity can make systems less secure, rather than more secure, since the mistakes are likely to happen where employees are interacting with systems they don’t fully understand. These tech-based processes should be constantly reviewed with an eye towards eliminating unnecessary steps or simplifying those that have become overwrought. If redundancy is required, there should be a sound case for it and not redundancy just to have it.

In his book, The Checklist Manifesto: How to Get Things Right, author Atul Gawande states: “In a complex environment, experts are up against two main difficulties: One, the fallibility of human memory and attention for mundane, routine matters. And two, people can easily lull themselves into skipping steps even when they remember them. This has never been a problem before, people say. Until one day it is.” Does this sound familiar?

Measure for Measure

Most processes that I’ve reviewed over the last 15 years measure one component of their operation very well, while ignoring other critical metrics. In the case of SOCs or NOCs, this means that the focus is on Attacks Successfully Mitigated or similar types of metrics. While those types of metrics can be important and should continue to be measured, the goal of metrics is to help surface as many operational problems as possible so that they can be resolved. When beginning process improvement in a SOC or NOC, it is difficult to figure out specifically where problems are and what to solve. Starting with an effective measurement system helps to surface issues that people can identify immediately and then “own” through resolution. In a cybersecurity environment, the challenge is to “level the playing field” with transactions in a way that measures each team equally.

Productivity, for example, is a useful metric in a SOC or NOC environment, but it usually isn’t measured, or isn’t measured effectively. As an example, the scatter plot chart shown below was used to measure a “Tier 1” alert triage process in a global SOC operation. The teams were responsible for responding to all email alerts, handoffs from other departments, and inbound phone calls. Productivity had never been managed in any of the SOC operations prior to this, nor were managers aware of how to do this. Prior to implementing this metric, we grounded everyone within the triage process with the process maps to make sure that everyone was on the same page with what would be documented.

This approach was impactful as it took the “people” out of the metric and gave management and the teams a way to think about why their processes were or weren’t performing to the level they should be at. Everyone agreed that speed was a critical factor to measure and that they needed to perform better. But it did another thing that neither an audit nor any other process map could do: it changed people’s perceptions about their process and encouraged them to begin surfacing operational problems.
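
One way to picture a team-level productivity metric of the kind described above is weighted work completed per staffed hour. The sketch below is an assumption-laden illustration, not the measurement system actually used: the three transaction types come from the triage process described earlier, but the weights, team figures, and function names are invented. Weighting each transaction type by relative effort is one possible way to “level the playing field” between teams that handle different mixes of work.

```python
# Assumed relative effort per transaction type (illustrative values only).
WEIGHTS = {"email_alert": 1.0, "handoff": 1.5, "phone_call": 2.0}


def weighted_units(counts):
    """Total work completed, with each transaction type scaled by effort."""
    return sum(WEIGHTS[kind] * n for kind, n in counts.items())


def productivity(counts, staffed_hours):
    """Weighted work units completed per staffed hour for a team."""
    if staffed_hours <= 0:
        raise ValueError("staffed_hours must be positive")
    return weighted_units(counts) / staffed_hours


# Two hypothetical teams with very different mixes of work.
team_a = {"email_alert": 120, "handoff": 20, "phone_call": 10}
team_b = {"email_alert": 60, "handoff": 40, "phone_call": 25}

rate_a = productivity(team_a, staffed_hours=40)
rate_b = productivity(team_b, staffed_hours=40)
```

A raw count would show team A handling 150 items and team B only 125, yet under these assumed weights both come out at 4.25 units per hour. The metric describes the process and its workload, not individual people.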

The teams within the process began to complain about the challenges they had in completing their jobs, both technical and process-related. Those complaints surfaced problems that could be fixed or bumps that could be ironed out—exactly what we were looking for. This translated directly into risk reduction as the entire department began to focus on improving the process. The most ironic part of the feedback that we received was that most problems focused on processes and the interaction with technology rather than the technology itself.

Problems such as how runbooks were structured, multiple issues with handoffs, and even issues with the use of UTC vs. local standard time were identified. The number of issues the teams reported was eye-opening. These are the little problems that slow transactions down and increase the likelihood of something going wrong. The end result? A 25 percent increase in productivity within three months. Imagine if that type of thinking were transferred into other parts of a cybersecurity organization. A lot of risk could be managed better or reduced.
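
The UTC versus local time issue mentioned above is a good example of a small problem with a simple, standard fix: convert every timestamp to UTC before comparing or recording it. The sketch below assumes, purely for illustration, that a monitoring tool logs in UTC while an analyst notes times in a local zone five hours behind UTC; the timestamps and function names are hypothetical. A fixed offset is used to keep the example self-contained; a real deployment would more likely use named time zones (for example via Python’s `zoneinfo` module) so that daylight saving changes are handled.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_naive, utc_offset_hours):
    """Interpret a naive local timestamp at a fixed UTC offset; return UTC."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return local_naive.replace(tzinfo=local_tz).astimezone(timezone.utc)


def minutes_between(start_utc, end_utc):
    """Elapsed minutes between two timezone-aware timestamps."""
    return (end_utc - start_utc).total_seconds() / 60


# Hypothetical alert: raised at 14:05 UTC, picked up at 09:20 local (UTC-5).
raised = datetime(2024, 1, 15, 14, 5, tzinfo=timezone.utc)
picked_up = to_utc(datetime(2024, 1, 15, 9, 20), utc_offset_hours=-5)

response_minutes = minutes_between(raised, picked_up)
```

Read side by side without conversion, the two clock times suggest the alert was picked up almost five hours before it was raised. After conversion, the response time is 15 minutes.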

Metrics like productivity are powerful motivators when management and people in the process can see how their efforts are reflected. This is where basic process improvement tools such as standardization of processes are useful to implement. Standardizing processes is more than just ensuring there is a repeatable process, although that is where people usually start and finish. Process standardization also serves to help people within the process identify when something isn’t right, so they can correct it before the transaction enters the process. Other benefits of standardization are that transactional flow and inter-process communication are improved as well. As Gawande states in his book, “If you miss just one key thing, you might as well not have made the effort at all.”

It’s true that internal audit teams can only focus on smaller, but critical, portions of the enterprise at any given time to identify risk in its many forms and ensure it is reduced or managed more effectively. Aligning with a process improvement or continuous improvement organization, however, is a way to ensure that more eyes are focused on streamlining those processes and reducing risk. In the end, the process owners, employees, managers, customers, and ultimately shareholders will all be better off.

 

Mark Abrams is a certified LSS Black Belt with over 15 years of experience in improving operations across industries including hospitality, insurance, manufacturing, IT professional services, and cybersecurity, and is managing director at Polaris Process Innovations.