Matt Taddy was Professor of Econometrics and Statistics at the University of Chicago Booth School of Business from 2008 to 2018, where he developed the school's Data Science curriculum. He reveals how to use machine learning to understand your customers, frame decisions, and drive value.
Artificial Intelligence is changing how business operates. Technology and automation have long impacted how people do their jobs, but AI is going to be extremely useful in fields that have not traditionally made heavy use of automation: finance, product design, and executive leadership. Machine learning and reinforcement learning are able to automate steps of a decision making process, and the ranks of business decision makers are going to need to adapt to work with these new AI technologies.
Consider ‘AB’ testing, which over the past 20 years has become a key step in decision making at many firms. AB testing is simply what statisticians have been doing since the First World War: trying a few different treatments and measuring their effects. In its modern incarnation, anyone with a website can randomize different content (treatments) across users and see what happens.
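To make the mechanics concrete, here is a minimal sketch of how a two-treatment test is often scored. It assumes each user either converts or does not, and that a pooled two-proportion z-test is an acceptable summary. The function name `ab_test` and the counts in the usage note are hypothetical, not taken from any particular firm's tooling.

```python
import math


def ab_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of treatments A and B.

    Returns (lift, z, p_value), where lift is rate_B - rate_A and
    p_value is two-sided under a pooled two-proportion z-test.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B perform the same
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value
```

For example, `ab_test(100, 1000, 130, 1000)` compares a 10% and a 13% conversion rate on 1,000 users each; the p-value it returns indicates how surprising a gap that large would be if the two treatments were truly equivalent.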
It is good to evaluate your ideas against data. But AB testing is flawed as a tool for complex decisions. The problem is that the configuration of treatments A and B (and C and D, etc.) is highly manual. Suppose that you are designing a web tool for making restaurant reservations (I’ve never worked at OpenTable, so this is all imaginary). There are several actions you must take to specify layout and content. It will be impossible to look at all possible configurations of these actions, so you will choose a few favorites for testing. After the test, the best option will be launched but there will be a desire to make further improvements, say to optimize for time-of-day or to highlight specific customer reviews. So you configure some new treatments and run another test.
This process might sound reasonable, but with more than a small number of design options it turns into an exhausting slog of iterative experiments. Each treatment is a complex policy, specifying the full set of actions for your website in every possible customer scenario. A manually designed policy will consist of a complex set of heuristic rules (‘show breakfast restaurants before noon’ or ‘show the most recent review’) and it will be impossible to design a uniformly great policy in a single treatment. That impossibility dooms you to incremental policy tweaking, long waits for AB tests to complete, and little direction on what to do in each new treatment. You’ll be very slowly throwing stuff at the wall and seeing what sticks.
How do you get out of this slog? The answer is to use Machine Learning (ML) to automate and accelerate the process. ML can learn complex patterns in data and use these patterns to predict future observations. Deep neural networks, random and boosted forests, and regularized regression are common flavors of ML. While the scope of an individual ML algorithm is limited to relatively ‘dumb’ prediction problems (i.e., predicting a future that resembles the past), fast scalable ML is the electric motor that powers our new AI. Modern intelligent systems consist of many individual ML algorithms, each solving a simple prediction problem, that together combine to automate a complex process.
There are two ways to automate the design of business policy using ML. The first, and better, option is to use ML tools in your policies. Instead of using fixed rules, our reservation website could deploy an ML algorithm to predict the most commonly booked restaurants at different times or the most widely read reviews. These ML algorithms are trained ‘offline’, looking at past customer behavior. The predicted most-likely-picks are automatically served within constraints on a web-page layout that you have designed. Your treatments become combinations of ML prediction algorithms rather than a tangle of heuristic rules.
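As a rough illustration of an offline predictor, the sketch below learns most-likely-picks from past bookings. It deliberately uses simple frequency counts as a stand-in for a real ML model, and everything in it is an assumption made for illustration: the `daypart` buckets, the `(hour, restaurant)` log format, and the function names.

```python
from collections import Counter, defaultdict


def daypart(hour):
    """Bucket an hour of day (0-23) into a coarse, hypothetical daypart."""
    if hour < 12:
        return "morning"
    if hour < 17:
        return "afternoon"
    return "evening"


def fit_top_picks(bookings):
    """Train offline: count past bookings per daypart.

    bookings is an iterable of (hour, restaurant) pairs from historical logs.
    """
    counts = defaultdict(Counter)
    for hour, restaurant in bookings:
        counts[daypart(hour)][restaurant] += 1
    return counts


def serve(counts, hour, k=2):
    """Return the k most frequently booked restaurants for this time of day."""
    return [name for name, _ in counts[daypart(hour)].most_common(k)]
```

The rule ‘show breakfast restaurants before noon’ no longer has to be written by hand: if breakfast places dominate morning bookings in the logs, `serve` surfaces them on its own. A production system would swap the counts for a trained model, but the division of labor is the same: you design the layout, and the predictor fills it.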
The best option is to move beyond AB testing to Reinforcement Learning (RL), which replaces the manual iteration between AB tests with an automated sequence of mini-experiments. You again design policies that are combinations of ML prediction algorithms, but each policy is now accompanied by a strategy for data acquisition. Instead of serving most-likely-picks from offline ML, you run a mini-experiment and randomize what you serve amongst a set of somewhat-likely-picks. The ML algorithms are constantly re-optimizing in response to results from these mini-experiments (did the user engage with the content?), yielding continuous optimization and experimentation instead of clunky step-by-step guess and test. This makes everything move faster. Even more importantly, RL allows ML to address problems of causation instead of just correlation. By randomizing which reviews you show to customers, you learn when a customer is engaged because of certain content. With offline ML, you learn mainly that customers read reviews that your previous policies promoted most prominently, and you do not gain any insight into what they would read if you promoted different reviews. This causal inference capability allows you to automate far more policy design components than you could with backwards-looking ML.
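One common way to implement this kind of continuous mini-experiment is Thompson sampling, sketched below. The article does not name a specific algorithm, so treat this as one illustrative choice. It assumes a binary engagement signal per impression; the class name, review labels, and simulated engagement rates are all hypothetical.

```python
import random


class ThompsonSampler:
    """Randomize among options in proportion to how promising each looks."""

    def __init__(self, arms, seed=0):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior: start every option with one pseudo-success
        # and one pseudo-failure, i.e. no initial preference.
        self.successes = {arm: 1 for arm in arms}
        self.failures = {arm: 1 for arm in arms}

    def choose(self):
        # Draw a plausible engagement rate for each option from its
        # posterior, then serve the option with the highest draw.
        draws = {
            arm: self.rng.betavariate(self.successes[arm], self.failures[arm])
            for arm in self.successes
        }
        return max(draws, key=draws.get)

    def update(self, arm, engaged):
        # Re-optimize immediately after each mini-experiment.
        if engaged:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


def simulate(true_rates, rounds=3000, seed=0):
    """Run the sampler against simulated users; return impressions per option."""
    sampler = ThompsonSampler(list(true_rates), seed=seed)
    user_rng = random.Random(seed + 1)
    served = {arm: 0 for arm in true_rates}
    for _ in range(rounds):
        arm = sampler.choose()
        engaged = user_rng.random() < true_rates[arm]
        sampler.update(arm, engaged)
        served[arm] += 1
    return served
```

Calling `simulate({"review_a": 0.05, "review_b": 0.10, "review_c": 0.30})` shows the behavior described above: early on, all three reviews are shown roughly at random, and as evidence accumulates the traffic shifts toward the review that actually drives engagement. Because the content was randomized, the learned rates reflect the effect of showing each review rather than the habits of a previous policy.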
This might all sound obvious. Of course I should use ML to predict what customers want to see. But most decision makers do not make enough use of ML in their policy design. Website optimization with ML is so common that we take it for granted, but many other types of policies (from product development to operations to investment) should use ML components. On the flip side, a misunderstanding of ML (remember: it is just finding patterns) can lead you to outsource to algorithms decisions that properly require human judgement (aesthetic, ethical, or executive). If you are a policy designer who doesn’t understand what should and should not be automated with ML, and how policies that include ML will learn and adapt, then your key decisions will be delegated to engineers and scientists (or worse, they will not be made at all). You lose control of the decision making process and are no longer the actual decision maker.
Fortunately, you can adapt to this coming automation by building an understanding of how ML recognizes patterns in past data, how uncertainty around these patterns is quantified and mitigated, and how data are collected. If you have the stomach for a dose of math and snippets of code, my book Business Data Science is one place to start learning. But regardless of how you familiarize yourself, the key outcome is to be able to use ML as an ingredient in your decision process. Metrics, spreadsheets, and customer surveys have long been essential to decision makers. ML and RL are simply the next generation of tools to facilitate data-driven decisions.