Stock trades are executed in milliseconds. Chauffeured cars can be summoned almost instantly. Yet people are still routinely made to slow down and cool their heels. We wait in lines at the grocery store and the doctor’s office, on the phone with customer service, and virtually when buying concert tickets or waiting for a website to load.
But as consumers are coming to expect ever-faster service, companies are looking for ways to keep lines moving. Customers are, too, and have found some shortcuts. For example, if you’re stuck on hold with an automated phone system, it may fast-track you if you swear, according to the technology news site TNW.
Researchers specializing in queuing theory may have better solutions. For most of the past century, operations researchers developed systems to reduce lines, and they were largely successful at streamlining factories, telephone exchanges, and other automated operations. But with the service sector now dominating the economy, some recognize that they need a new approach, one that takes into account human behavior. Many lines that form these days are affected by people’s quirks and biases, including their propensity to swear or hang up when frustrated by circuitous automated phone systems. Anticipating these reactions could help one person cut the line, but it could also help many people get what they’re waiting for more quickly.
Get in the queue
Queuing theory is a branch of mathematics that optimizes waiting times. Some of the earliest work was done at the turn of the 20th century by the Danish mathematician Agner Krarup Erlang, who tried to predict load times for the telephone exchange so that people could pick up a receiver and hear a dial tone rather than a busy signal.
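The calculation that grew out of Erlang's telephone work can be sketched in a few lines. The example below uses the standard Erlang B formula for the chance that a caller finds every line busy. The function name and the traffic figures are illustrative assumptions, not data from any real exchange.

```python
def erlang_b(offered_load: float, lines: int) -> float:
    """Probability that an arriving call finds all lines busy (Erlang B).

    offered_load is the arrival rate times the mean call length, in erlangs.
    Uses the standard recursion B(0) = 1, B(k) = a*B(k-1) / (k + a*B(k-1)).
    """
    blocking = 1.0
    for k in range(1, lines + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


if __name__ == "__main__":
    # Hypothetical exchange carrying 10 erlangs of traffic.
    for lines in (10, 15, 20):
        print(f"{lines} lines: blocking probability {erlang_b(10.0, lines):.4f}")
```

Adding lines lowers the chance of a busy signal, and the formula shows how many lines a given level of traffic needs to hit a target.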
Since then, much work in the field has focused on waiting times in automated environments, such as manufacturing, or on fairness and scheduling in computer science. Consider a semiconductor factory, where silicon wafers are made for use in electronic devices. Research has developed ways to organize the factory so that each wafer is fabricated in the fastest and most efficient way. These factories are in some ways an ideal test case for wait times. The goods produced are delicate and expensive, giving companies an incentive to carefully but expeditiously move them through the production line and out the door. No company wants to build up unnecessary inventory and get stuck paying for storage. Traditional queuing theory assumed that jobs would wait forever to be processed. Those silicon wafers, after all, are inert and not going anywhere of their own accord.
But over the past few decades, the manufacturing sector has contracted while service has expanded to represent roughly two-thirds of US economic activity. The trend has taken hold across the globe: according to the World Bank, the service sector’s value added amounted to 65 percent of global GDP in 2017, versus 62 percent in 1997. As this has happened, queuing theorists have become more interested in the human side of queues. In hospitals, restaurants, and shops, lines comprise people, not inanimate objects. People are at the front of lines, too, routing patients to hospital beds, leading hungry patrons to tables, or ringing up purchases.
With $200 billion in annual revenues, according to outsourcing expert CustomerServ, the call-center business is a particularly good example of an industry that involves both long wait times and the crucial factor of human reactions. At call centers, both customers and customer-service agents are masses of feelings, biases, and restlessness. “We need to develop formulas that take into account human behavior; for example, customers hanging up,” says Chicago Booth’s Amy R. Ward.
This requires different math than that used in the factory context. Most academics approach queuing theory with stylized models that assume exponential distributions and require a finite-dimensional state. These models mostly reveal the way things should behave, but not the way things actually happen in the real world, says Amber Puha of California State University at San Marcos. Puha studies measure-valued processes, a tool that allows for writing equations to track how long every customer in a line has been waiting, and how long every customer being helped has been speaking to an agent.
To set up the kind of queuing problem that presents itself at a call center, Puha, Ward, and other operations researchers first divide customers into classes depending, for example, on the type of service desired. One person calling a bank might want to know her account balance, while another might want to report a stolen credit card. The goal is to design control policies that push customers through the process as quickly as possible. For most companies, the point at which someone calls in and needs to be paired with an agent is the start of the challenge. Quality of service is measured by short wait times, and the first point of contact affects the company’s reputation.
Customers who refuse to wait provide an example of how human behavior can short-circuit a traditional queuing model. People hang up when they’re frustrated. Wafers do not. “We’re interested in more fully understanding the implications of this, how to maximize revenue, minimize customer dissatisfaction, and maximize customers being happy with the service,” says Puha.
In one project, Puha and Ward set up a mathematical model that incorporates different types of customers and used a probability model to distribute them, seeking to capture the behavior of frustrated humans hanging up when on hold. Then they analyzed how the system works when the number of call-center agents is very large.
The researchers thought first about what happens when call-center agents can do anything, such as speak every language. “Because this phenomenon of how abandonment [when customers hang up, or abandon a transaction] affects the overall system performance is not particularly well understood, it’s interesting to understand it even when agents are fully cross-trained,” Puha says. After that, they could study what happens when an agent has a specialization, such as speaking only English, or only Spanish. Would people calling in still hang up at the same rate?
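A much simpler textbook model shows how hang-ups enter the math. The sketch below is not Puha and Ward's model: it assumes one customer class, fully cross-trained agents, and exponentially distributed patience (the standard Erlang-A queue), whereas their work handles several classes and general patience distributions. All names and numbers are hypothetical.

```python
def erlang_a_abandon_fraction(arrival_rate, service_rate, agents,
                              patience_rate, max_state=2000):
    """Long-run fraction of callers who hang up in an Erlang-A (M/M/n+M) queue.

    The number of callers in the system is a birth-death chain: births occur
    at arrival_rate; deaths occur at min(k, agents) * service_rate (calls
    finishing) plus max(k - agents, 0) * patience_rate (callers hanging up).
    The chain is truncated at max_state, which is adequate for moderate
    system sizes like the ones used here.
    """
    weights = [1.0]  # unnormalized stationary probabilities
    for k in range(1, max_state + 1):
        death = min(k, agents) * service_rate + max(k - agents, 0) * patience_rate
        weights.append(weights[-1] * arrival_rate / death)
    total = sum(weights)
    mean_queue = sum(max(k - agents, 0) * w for k, w in enumerate(weights)) / total
    # Each waiting caller hangs up at patience_rate, so the abandonment rate
    # is patience_rate * mean_queue; divide by arrivals to get a fraction.
    return patience_rate * mean_queue / arrival_rate


if __name__ == "__main__":
    # Hypothetical center: 120 calls per minute, 100 agents, 1-minute calls,
    # and callers willing to hold for 2 minutes on average.
    fraction = erlang_a_abandon_fraction(120.0, 1.0, 100, 0.5)
    print(f"fraction of callers hanging up: {fraction:.4f}")
```

When demand exceeds capacity in this simplified setting, the fraction of callers lost lands close to the excess demand divided by total demand, here about 20 out of 120. Abandonment, in other words, is what keeps an overloaded line from growing forever.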
Inequality on the line
A model that seeks to optimize the queue reflects a company’s goals and priorities, which may or may not include fairness. When establishing rules for how to route people, a company has to decide whether to treat customers more or less equally or to favor some over others. Will some groups of callers be forced to wait longer than others, on the basis of their frequent-flyer tier, hotel loyalty points, or order history? Say a customer has a credit card designed for big spenders, for which she pays a hefty annual fee. When she calls customer service and types in her credit-card number, should she be routed to a shorter queue for high-value customers? In many cases, companies have decided that yes, she should get this priority treatment.
Consequently, the conversation with the service representative who answers your call may be more pleasant and efficient when you are a member of a group the company wants to keep happy, and companies can use mathematical models to ensure this happens. “I can look at the solution to that optimization problem, rank the classes [by how valuable they are to me], and serve my highest priority customers first,” says Ward. “But that optimization problem is not capturing adverse consequences from treating customers unequally.”
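The rank-and-serve rule Ward describes can be sketched as a priority queue. This is a minimal illustration, not a production router: the class name, the method names, and the customer classes are all hypothetical.

```python
import heapq
from itertools import count


class PriorityRouter:
    """Serve waiting callers by class rank, first come first served within a class."""

    def __init__(self, class_rank):
        self._rank = class_rank      # lower number means served sooner
        self._heap = []
        self._arrivals = count()     # tie-breaker that preserves arrival order

    def add_caller(self, caller_id, customer_class):
        entry = (self._rank[customer_class], next(self._arrivals), caller_id)
        heapq.heappush(self._heap, entry)

    def next_caller(self):
        """Return the caller an idle agent should take next, or None if nobody waits."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


if __name__ == "__main__":
    router = PriorityRouter({"premium": 0, "standard": 1})
    router.add_caller("caller-1", "standard")
    router.add_caller("caller-2", "premium")
    router.add_caller("caller-3", "standard")
    print([router.next_caller() for _ in range(3)])
```

The premium caller jumps ahead even though she phoned second. Nothing in the rule itself limits how long the standard callers can be kept waiting, which is the adverse consequence Ward points to.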
Many of the problems Ward formulates focus on averages. If a center’s average wait time is one minute, that could be because everyone waits for one minute, or it could be because 98 percent of customers have no wait at all, while 2 percent of them wait for 50 minutes. Those tails can have an impact on a business, so it makes sense to design a problem that incorporates both average and variability, Ward says.
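A quick calculation shows how two very different experiences can hide behind the same average. The figures below are illustrative, chosen so that both scenarios average exactly one minute, and the function name is an assumption.

```python
from math import sqrt


def summarize(distribution):
    """Return (mean, standard deviation) of a discrete wait-time distribution.

    distribution is a list of (wait_minutes, probability) pairs whose
    probabilities sum to 1.
    """
    mean = sum(wait * prob for wait, prob in distribution)
    variance = sum(prob * (wait - mean) ** 2 for wait, prob in distribution)
    return mean, sqrt(variance)


# Scenario 1: every caller waits one minute.
everyone_waits = [(1.0, 1.0)]
# Scenario 2: 98 percent wait nothing, 2 percent wait 50 minutes.
few_wait_long = [(0.0, 0.98), (50.0, 0.02)]

if __name__ == "__main__":
    for name, dist in (("everyone waits", everyone_waits),
                       ("few wait long", few_wait_long)):
        mean, spread = summarize(dist)
        print(f"{name}: mean {mean:.2f} min, standard deviation {spread:.2f} min")
```

Both scenarios have a mean of one minute, but the second has a standard deviation of about seven minutes. That spread is why an objective built only on the average misses the callers in the tail.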
The model Ward and Puha wrote can include fairness constraints. For example, given the number of agents staffed and the customer demand, there must be a certain amount of waiting, say 10 minutes in total across two customer groups. The question is: Should one customer group have one minute of waiting and the other group have nine minutes, or should the split be 50–50, or something else? In different situations, a company may need to be more or less worried about treating customers more or less equally, Ward says.
This ties in to the more general problem of social inequality, an issue that isn’t accounted for in call-center models but is very much a concern in real life. Routing a VIP customer to a shorter queue may make business sense, but it’s another reason people have disparate experiences that cause them to have differing views about, say, access to resources.
The business decision a company has to make involves where on the axis of fairness it wants to sit. Taking into consideration a company’s priorities, call-center executives have to decide how many resources to allocate, such as the number of agents who answer the phones. In addition to where customers are placed in line, a model also determines which agent they are routed to. Pairing customers with agents requires companies to decide how many employees to hire at different skill levels. A call center’s scheduling algorithm can take into account which employees can serve which types of customers. Of course, it’s more expensive to hire only agents with top-level skills. Moreover, if the VIP customer is routed to a high-skilled agent when several other customers are waiting on hold, the decision affects everyone’s wait times, and those of future customers.
The next stage for managers is to give people incentives to do the most efficient thing. “The first step in solving the scheduling problem is to assume the employees will do exactly what I want,” says Ward. “Then I can deliver the same quality of service to different customer classes as somebody else, but with fewer employees, so my model would allow you to be more efficient. But what if the employees do not behave consistent with my solution?”
Agents are people too
Fairness and efficiency are concerns with respect to the way queuing models treat not only customers but also the call-center agents who interact with them. These representatives offer another set of potential behavioral issues that can crop up when humans don’t do what the algorithm dictates. Agents can have bad days, or be tired, or need coffee, or feel berated. They may work quickly or slowly. They have their own preferences for the types of calls they want to answer, and in some cases they can even dictate whether they will see a variety of call types.
“If we think of higher-touch services where humans are providing the service, the human element kicks in both on the customer side and the server side,” says New York University’s Mor Armony. Armony has coauthored two papers with Ward, and they have studied how to route calls so that agents have time to take breaks. But here is the conundrum: Suppose you are the most-effective employee. Thanks to your efficiency, you tend to get more work routed your way, and you may take steps to protect yourself and your time if your manager fails to do so. An academic referee who produces quality reviews quickly, for example, will likely be asked to review more papers, so that person may decide to take longer to return reviews in order to keep the workload in check.
Raga Gopalakrishnan of Queen’s University in Kingston, Ontario, and Ward are conducting research together, and their investigations include how to account for agent burnout, how to design systems so that more-efficient agents don’t feel as though they are being punished when they are assigned more work than others, and how such considerations interact with customer behavior (less work being done by the agents means higher wait times for the customers).
Gopalakrishnan and Ward, together with Booth PhD student Yueyang Zhong, are currently looking at the strategic behavior of both customers and agents. In their model, an agent will choose how quickly to work. While customers in a call center can’t necessarily see where they are in a queue, customers in a grocery store often can, and both they and the cashiers sometimes make decisions accordingly, leading to complex interactions. Say one checkout line is shorter than several others. More customers may join the shorter line, and the cashier, seeing a longer line, may speed up. (See “Which type of grocery queue is better?” below.)
Gopalakrishnan, Ward, and Zhong are working to analyze how the customer and agent trends interact with each other in order to design service systems that would settle down into and operate at an efficient equilibrium.
From search requests to hospital beds
In mathematical modeling in general, and queuing in particular, tools used in one context can be applied in another. This holds true in behavioral projects, where the same models used to optimize call centers can be used to solve other waiting-time and resource-allocation problems in areas from restaurants to retail. Armony is using queuing theory to study patients’ wait times in hospitals. Although this is a physical problem—patients are present, rather than waiting on the phone—it shares many characteristics with call-center issues.
Which type of grocery queue is better?
Research indicates this is something of a trick question, as the speediest option isn’t necessarily the one shoppers prefer.
The TV show MythBusters devoted a 2016 episode to testing both propositions. The show’s hosts herded volunteers through a stage set of a grocery store. When shoppers self-organized into multiple checkout lines, they waited 5 minutes and 39 seconds, on average, and gave the experience a satisfaction rating of 3.48 out of 5.
Then the show’s hosts organized the volunteers into a single checkout line, gave them the same instructions, and assigned the person at the front of the line to one of several registers. These shoppers ended up waiting longer, almost 7 minutes. However, they rated the experience a 3.8, presumably because this method was fairer.
The notion that one line is better is taught to many MBA students in operations-management classes. This idea can be validated mathematically, but the underlying assumption in the math is that the checkout clerks work at the same rate, regardless of the number of lines.
However, the math doesn’t take into account worker behavior—and both shoppers and cashiers affect how queues move. University of Washington’s Masha Shunko, Syracuse’s Julie Niederhoff, and Purdue’s Yaroslav Rosokha find, as MythBusters did, that the fastest way to move shoppers through a store is to let them choose a line alongside a register. But the researchers also find that the speed is due to the way checkout clerks work. With their own lines, clerks in an experiment worked faster, in part to compete with their peers on other registers. When there was a single line, by contrast, no individual cashier owned the results, leading to a loss of competitiveness—and checkout speed.
“The single-line workers didn’t work at the same speed as their counterparts doing the same task in the multiple-line system,” Niederhoff says. Instead, the single-line cashiers slowed down by about 10 percent. When everyone was equally responsible for getting customers out the door faster, it seems, nobody felt they had to rush. This decrease in worker speed can negate the benefits that the single-line design is predicted to create.
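The textbook comparison, and its sensitivity to cashier speed, can be sketched with standard queuing formulas. The example assumes random arrivals, exponential checkout times, and, for the multiple-line case, shoppers who split evenly across registers and never switch lines. That last assumption is pessimistic, since real shoppers pick short lines. The store size and rates are hypothetical and are not taken from the study.

```python
def single_line_wait(arrival_rate, service_rate, registers):
    """Mean time in queue for one shared line feeding all registers (M/M/c)."""
    load = arrival_rate / service_rate
    utilization = load / registers
    if utilization >= 1:
        raise ValueError("arrivals exceed capacity; the line grows without bound")
    blocking = 1.0
    for k in range(1, registers + 1):            # Erlang B recursion
        blocking = load * blocking / (k + load * blocking)
    prob_wait = blocking / (1 - utilization * (1 - blocking))   # Erlang C
    return prob_wait / (registers * service_rate - arrival_rate)


def separate_lines_wait(arrival_rate, service_rate, registers):
    """Mean time in queue when shoppers split evenly across independent M/M/1 lines."""
    per_line_arrivals = arrival_rate / registers
    utilization = per_line_arrivals / service_rate
    if utilization >= 1:
        raise ValueError("arrivals exceed capacity; the lines grow without bound")
    return utilization / (service_rate - per_line_arrivals)


if __name__ == "__main__":
    arrivals, registers = 3.2, 4      # hypothetical: 3.2 shoppers per minute, 4 registers
    full_speed, slowed = 1.0, 0.9     # checkouts per minute; 0.9 models a 10 percent slowdown
    print(f"separate lines, full speed: {separate_lines_wait(arrivals, full_speed, registers):.2f} min")
    print(f"single line, full speed:    {single_line_wait(arrivals, full_speed, registers):.2f} min")
    print(f"single line, 10% slower:    {single_line_wait(arrivals, slowed, registers):.2f} min")
```

In this hypothetical store, a 10 percent slowdown more than doubles the single-line wait. Here the single line still beats evenly split lines, but that comparison flatters it: shoppers who choose the shortest line narrow the gap, so the slowdown does more damage to the single line's advantage in practice than this sketch shows.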
From the customer’s perspective, a single-line setup is preferable because it negates customers’ anxiety about which line to pick, she says. Despite her research, Niederhoff prefers, when possible, to stand in a single line when shopping.
“There’s no risk the person in front of you will sabotage the line by doing something slowly,” she says.
But what about a store that also has one or more express lines serving people with fewer items? In this case, the math suggests that it’s still probably best to have one regular line, in addition to the express line—although it depends on how much shorter the average checkout time is in the express line.
Masha Shunko, Julie Niederhoff, and Yaroslav Rosokha, “Humans Are Not Machines: The Behavioral Impact of Queueing Design on Service Time,” Management Science, February 2017.
People often think that emergency departments are busy because patients come in at random times, says Armony, but the distribution of patients by arrival time is actually fairly standard. In reality, while people don’t call a center or arrive at a hospital at a constant rate, there are patterns. Most people go to the ER at convenient times, such as after dropping their children at school, or on a Monday after getting sick on the weekend. Then these patients clog up the emergency department as they wait for a transfer to an equally overcrowded inpatient ward. Hospitals need to know how to get patients to beds faster, in order to treat patients expeditiously and make the best use of hospital resources.
“We know health-care costs are really high, and if people wait too long to get medical treatment, there could be severe effects. But [the problem] also translates into a mathematical theory that ends up being useful,” Armony says.
She observes that some hospitals have implemented tracking systems to winnow out the ER patients who arrive needing urgent care–type treatment, such as a prescription, rather than an inpatient bed. In the hospital she is studying, experienced triage nurses assign incoming patients a score of 1 to 5 on the basis of how likely they are to need an overnight stay in the hospital. The urgent-care patients are then treated quickly in a separate area and released. Most who stay in the ER are those who score a 3, while those scoring 1 or 2 may need surgery or a trip to the intensive-care unit.
Other research, including a recent paper Armony coauthored, deals with queuing for operating rooms, which involves optimizing the schedule so that patients can receive surgery quickly and find a recovery-room bed waiting when they are finished. When hospitals move patients into a nursing ward, they have to determine the order in which wards will take new patients, on the basis of the beds available and the patients’ medical problems. Hospitals can use queuing theory to make these assignments, but a real test is what happens within the hospital itself: Do nurses and doctors follow computerized rules implemented by the hospital or do they make their own decisions? And which group, the humans or the machines, makes better decisions for patients? Hospital executives know their trained staff have deep expertise in treating patients, but they also know that these staff members might not recognize the patterns a computer can see.
“The question is how to design the algorithm to support the decision makers in a way that would be helpful to them rather than intrusive or something they would just ignore,” says Armony.
A related issue is how hospitals route patients depending on the criteria that insurance companies and regulators use to judge the quality of care. Hospitals are penalized when patients who have been treated come back to the hospital too quickly, so some have an area for ER patients to wait under observation before they’re discharged, to make sure they are well enough to go home. This cuts down on readmission rates, thus presumably shortening future queues.
Health-care queuing is “not a problem that I think will ever be solved completely, as the state of the art continues to improve,” says Armony. Queuing problems can be solved mathematically, but researchers have much more to do in working toward better modeling of human behavior. That way, when an airline call-center rep finally picks up your call, you may not feel quite so angry about the amount of time you’ve been waiting.
- Mor Armony, Carri W. Chan, and Bo Zhu, “Critical Care in Hospitals: When to Introduce a Step Down Unit,” Production and Operations Management, 2013.
- Mor Armony, Rami Atar, and Harsha Honnappa, “Asymptotically Optimal Appointment Schedules,” Mathematics of Operations Research, August 2019. https://pubsonline.informs.org/doi/abs/10.1287/moor.2018.0973
- Mor Armony and Amy R. Ward, “Fair Dynamic Routing in Large-Scale Heterogeneous-Server Systems,” Operations Research, May 2010. https://pubsonline.informs.org/doi/pdf/10.1287/opre.1090.0777
- Ragavendran Gopalakrishnan, Sherwin Doroudi, Amy R. Ward, and Adam Wierman, “Routing and Staffing When Servers Are Strategic,” Operations Research, July 2016. https://pubsonline.informs.org/doi/abs/10.1287/opre.2016.1506?journalCode=opre
- Amber Puha and Amy R. Ward, “Fluid Limits for Multiclass Many Server Queues with General Reneging Distributions and Head-of-Line Scheduling,” Working paper, August 2019.
- ———, “Tutorial Paper: Scheduling an Overloaded Multiclass Many-Server Queue with Impatient Customers,” Tutorials in Operations Research, 2019.
- Amy R. Ward and Mor Armony, “Blind Fair Routing in Large-Scale Service Systems with Heterogeneous Customers and Servers,” Operations Research, January 2013. https://pubsonline.informs.org/doi/10.1287/opre.1120.1129