Automation is the key to unlocking a sustainable advantage for companies in every sector.
Big data can be a big nothing without a strategic automation approach.
On the one hand, we are in a heady period of information wealth, with unprecedented volumes of data on everything from equipment performance to consumer behavior on social media (more than half of the world’s population is on social media). On the other hand, without thoughtful automation, meaning the use of machines and algorithms to manage, process, and analyze available data, your business will miss out on major opportunities.
Done right, automation turns “dead” big data into a living and vital resource that you can use to generate value. So it’s no surprise that many companies aim to automate anything that can be automated, as one of Google’s top executives recently said.
To help you think about automation in your business context, I present the three main ways this technology-driven capability helps you create value.
The first thing automation helps you do is feature extraction, or pulling the critical information out of huge heaps of data. Imagine your organization needs to review patent applications for information on a specific technology and those related to it. You could be looking at thousands or tens of thousands of applications, each running 30 pages or more, for many millions of words in total. But only a fraction of those words, and of the interrelationships between patents, matter: for example, which earlier technologies a patent depends on, or the inventors’ qualifications and past patents.
This activity, like many in the corporate sector, therefore has a very low signal-to-noise ratio. Done manually, it would take thousands of hours of work, which is prohibitive in both cost and time. But a machine learning-based algorithm could be trained to find the key information relatively quickly, saving significant time and effort. Suppose, too, that you later want to search the same or a related set of patents for different information, such as the size of the applicant team. You could retrain or reprogram the algorithm for that task with comparatively little extra work, achieving economies of scale and a greater return on your initial investment.
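To make the idea concrete, here is a minimal sketch of feature extraction on a single application. It is a simple rule-based stand-in, not the machine learning approach described above, and the sample text, the field labels ("Inventors:", "Cites:"), and the function name are all assumptions made for illustration.

```python
import re

# Hypothetical application text; real filings are far longer and messier.
SAMPLE = """Title: Handheld ultrasound applicator
Inventors: A. Rivera; B. Chen; C. Okafor
Cites: US1234567, US7654321
Description: Dozens of pages of detail that this task does not need.
"""

def extract_features(text):
    """Keep a few fields of interest and ignore everything else."""
    inventors = re.search(r"^Inventors:\s*(.+)$", text, re.MULTILINE)
    cites = re.search(r"^Cites:\s*(.+)$", text, re.MULTILINE)
    names = [n.strip() for n in inventors.group(1).split(";")] if inventors else []
    cited = re.findall(r"US\d+", cites.group(1)) if cites else []
    return {"team_size": len(names), "cited_patents": cited}

features = extract_features(SAMPLE)
print(features)
```

Asking a new question of the same documents, such as the size of the applicant team, means adding one more field to the function rather than rereading every application, which is where the economies of scale come from.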
Second, automation helps with data checking and cleaning. Datasets often need work. They contain errors, missing values, anomalies, and sometimes evidence of bias. For example, if an algorithm is trained to detect the characteristics of offenders but uses data only on offenders who have been caught, it will be biased, because it lacks data on offenders who have not been caught. That is a particular problem for white-collar crime, which tends to be undercounted. Again, checking for and addressing this volume of potential problems is too much to handle manually. But automation allows testing and cleaning tools to be put in place quickly, once again saving time and creating value.
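As a rough illustration of an automated check, the sketch below flags missing values and anomalies in a small made-up dataset. The records, the field name, and the cutoff of five median absolute deviations are assumptions chosen for the example. Note that a check like this catches errors and outliers but not sampling bias of the kind just described, which requires asking how the data was collected.

```python
from statistics import median

# Hypothetical monthly sales records, one with a gap and one suspicious spike.
records = [
    {"id": 1, "sales": 102},
    {"id": 2, "sales": 98},
    {"id": 3, "sales": None},
    {"id": 4, "sales": 101},
    {"id": 5, "sales": 99},
    {"id": 6, "sales": 100},
    {"id": 7, "sales": 970},
]

def audit(rows, field, k=5.0):
    """Report ids with missing values and ids far from the typical value."""
    missing = [r["id"] for r in rows if r[field] is None]
    present = [r for r in rows if r[field] is not None]
    mid = median(r[field] for r in present)
    # Median absolute deviation: a spread measure that outliers barely move.
    mad = median(abs(r[field] - mid) for r in present)
    anomalies = [r["id"] for r in present if abs(r[field] - mid) > k * mad]
    return {"missing": missing, "anomalies": anomalies}

report = audit(records, "sales")
print(report)
```

The median-based rule is used here because, in a small sample, a single extreme value inflates the mean and standard deviation enough to hide itself.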
Third, and this is a big deal, automation is the driving force behind analysis. Yesterday’s simple regression analyses have become today’s clustering and random forests, fueled by machine learning, whether the goal is to understand product users, predict next month’s sales to optimize inventory, or predict the impact of a new advertising campaign. Machine-based automation not only lets you repeat standardized analyses regularly and at low cost, but can also identify non-linear patterns that we humans would miss.
For example, my lab studied over 5 million patents using algorithm-driven analysis to see whether we could predict the emergence of future breakthrough technologies from their patent application information. We speculated that the machine would pick out future successful patents from application data when the invention had stand-alone, “miracle-like” capabilities or ideas. In the end, the algorithm did find the successful patents of the future with high accuracy, but not in the way we humans had imagined. It did not identify a future successful patent by its stand-alone capabilities; rather, it identified successful patents by whether they belonged to a cluster of related patents that together could solve specific problems no single patent could have solved alone.
For example, ultrasound technology had a major impact on healthcare several years after it was first introduced, enabling non-invasive imaging and treatment of physical conditions such as kidney stones and even some cancers. But that advance would have been impossible without smaller inventions beyond the basic technology: applicators, static-reduction processes, specialized medical electrodes, and forceps that were developed independently of ultrasound but were critical to its successful application in medicine. Our automated analysis reliably recognized such groups of related patents across more than 5 million patents, from healthcare products to the latest golf ball technology, and found that membership in these groups correlated with the likelihood that a patent would become one of tomorrow’s dominant technologies: an inference not previously appreciated.
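One simple way to picture “groups of related patents” is as connected clusters in a network of links between patents. The sketch below is only an illustration of that idea, not the method my lab used; the patent names and the links between them are invented for the example.

```python
from collections import defaultdict

# Hypothetical patents and "related to" links between them.
patents = [
    "ultrasound", "applicator", "medical_electrode", "static_reduction",
    "golf_ball_core", "golf_ball_cover", "lone_gadget",
]
links = [
    ("ultrasound", "applicator"),
    ("ultrasound", "medical_electrode"),
    ("applicator", "static_reduction"),
    ("golf_ball_core", "golf_ball_cover"),
]

def related_groups(nodes, edges):
    """Return clusters of patents that are linked directly or indirectly."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, groups = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Walk outward from this patent until no new linked patents appear.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbors[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

groups = related_groups(patents, links)
for group in groups:
    print(len(group), group)
```

In this toy data the ultrasound patent sits in a group of four, while "lone_gadget" stands alone. Group membership of this kind, rather than any single patent’s own features, is the sort of signal the analysis above picked up on.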
My Northwestern colleague Andrew Papachristos used similar analyses to show that police corruption in Chicago stems not from a few “bad apple” officers, but from a network of connected officers acting in bad faith; his work allows for early detection of such problems.
I hope I have clarified the mutually reinforcing benefits of automation and how it can help you transform data into broad and sustainable value. In fact, the more data you have, the more automation you need; once you have strong automation skills, you can collect and leverage even more data and the cycle continues.
Bottom Line: Automation is an increasingly important capability and can be critical to your business’s short- and long-term performance. But it’s important to understand how it generates value and to take steps to mitigate its real drawbacks, for the good of your business and the broader community in which it operates.
In the second part of this article, I’ll discuss the three main downsides of automation (explainability, transparency, and cost) and how to address them.