The Information Explosion
Harnessing the Power of Big Data
By Riley Mackenzie
What studio executive could have imagined it: a movie about the role of computer analysis in assembling a winning baseball team that was nominated for six Oscars and cleaned up at the box office?
OK, it didn't hurt that Brad Pitt played the hero. But the 2011 movie "Moneyball," about how the Oakland A's harnessed the power of statistics to turn their fortunes around and set an American League record for consecutive wins, introduced a lot of people to what's now a hot topic in business and many other fields: big data.
When general manager Billy Beane crunched the numbers and started hiring players based on the specific ways they could have an impact on the game, it was big data he was using. Since then, the use of big data has gotten, well, a whole lot bigger. And, say School of Management professors who are teaching and using the techniques of big data analysis, the field opens unlimited opportunities for those with the tools to peer into massive databases and profit from what they can find there.
An explosion of information
Information is power, an old saying goes. It follows, then, that more information means more power.
At base, big data is an unimaginable lot of information: a collection of data sets so large and complex that traditional database management can't handle them. For comparison, the average laptop these days might have 500 gigabytes of hard drive storage (and when was the last time you filled up your hard drive?); Google alone processes more than 24 million gigabytes of data daily, according to a presentation on business intelligence that H. Raghav Rao, a SUNY Distinguished Service Professor in the Management Science and Systems Department, gave recently in Chennai, India. All that exchange of information adds up: Some estimates say the volume of business data worldwide, across all companies, doubles every 1.2 years.
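To put that doubling estimate in perspective, here is a back-of-the-envelope sketch in Python. The function name is ours, and the calculation assumes the 1.2-year doubling time quoted above holds steady, which is an illustration rather than a forecast.

```python
def growth_factor(years: float, doubling_time_years: float = 1.2) -> float:
    """Return how many times larger a quantity becomes after `years`
    if it doubles once every `doubling_time_years` (assumed constant)."""
    return 2 ** (years / doubling_time_years)


if __name__ == "__main__":
    # Illustrative horizons only; the 1.2-year figure is the estimate cited in the article.
    for years in (1.2, 6, 12):
        print(f"After {years} years: about {growth_factor(years):,.0f} times the starting volume")
```

Six years is five doublings, so under this assumption the volume of business data would grow roughly 32-fold in that time.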
Driving this explosion in available information is the spread of mobile devices including cellphones (4.6 billion worldwide, Rao says), radio frequency identification tags (more than 30 billion and counting), Web traffic, surveillance cameras, and digital sensors in industrial equipment, cars, electrical meters and shipping crates. And, says Ramaswamy Ramesh, professor and chair of management science and systems, the rapid growth of available computer power, including cloud computing and cluster computing, has made it possible for companies and even individuals to analyze big data effectively, an ability once enjoyed only by governments and universities with access to giant supercomputers.
Those who study big data define it by four characteristics, Ramesh says:
- Volume - the sheer amount of information to be analyzed.
- Velocity - the speed at which information arrives and changes.
- Veracity - the trustworthiness of the data.
- Variety - the different forms in which data is gathered, from Twitter messages to electrocardiogram results.
"The big question is, can this data be put to use?" Ramesh says. "If you're not able to analyze, interpret and apply the data in context, big data is worthless."
Says Debabrata Talukdar, professor of marketing, "These characteristics imply two big challenges for effective leveraging of big data: database management (how to organize, store and manage such data sets to make them amenable to analyses); and data mining and analysis tools (how to develop and apply appropriate analytical and statistical techniques to gain strategic insights that are likely to be embedded in the raw data)."
On the job for business
The big data revolution, these professors say, has only just begun. "Almost all sectors in the modern economy are being changed in significant ways by big data," Talukdar says. "That includes retail, health care, manufacturing and logistics. Essentially, big data helps both consumers and firms to better reduce the information uncertainty related to their decision environments and thus to make potentially better decisions.
"For example, think of the tremendous surge in publicly available data on the Internet that consumers can search for information on product prices and peer feedback on product quality. For businesses, big data is helping to achieve more customized or effective targeting in ads and price promotions, and to make better market sales forecasting. While big data is not going to usher in a 'perfect information' decision environment for consumers or firms, it is increasingly helpful."
Similarly, the finance sector, which is all about the exchange of information, has taken the techniques of big data to an extreme, says Cristian Tiu, associate professor of finance and managerial economics. Tiu, who studies nonstandard investors such as hedge funds and endowment managers, says stock trading has become so automated and so fast that stock exchanges are living examples of big data in action.
"It's just impossible to sit in a trading pit and see what's going on," Tiu says. "The frequency of data transmission is now at the millisecond level, beyond what humans can distinguish. The exchange is in effect a machine. You basically have a chip stuck in the machine, and the exchange has an algorithm that matches buyers and sellers."
And in other fields
Those who study big data say it's a concept that goes beyond the world of business. Data analysis can identify disease trends, improve medical treatment, fight crime, mitigate traffic congestion, even win elections. President Obama's 2012 re-election campaign, for example, employed sophisticated data mining techniques that enabled volunteers to microtarget voters in specific counties in key states.
In health care, researchers have found that Google search requests for terms like "flu symptoms" and "flu treatments" spike a couple of weeks before hospital emergency rooms in a particular region see an influx of flu patients. Mining that data means that health care workers can be ready and epidemiologists can help keep the disease from spreading nationwide.
Similarly, Ramesh points to the enormous amounts of medical data available on people with epilepsy, information that, properly analyzed, can help doctors treat these patients appropriately and can help the patients themselves know when a seizure is imminent.
Perhaps the most controversial use of big data analysis has been on the government level, with entities such as the National Security Agency coming under fire for what some consider intrusive data-gathering from social media and cellphone calls. But, as Ramesh argues, "The level of terrorism has come down enormously in the world since 9/11, and a major factor is big data analysis. The U.S. and other governments across the world play a central role in this. There is so much signal and noise, and the ability to sift through them is the critical component."
Under a grant from the National Science Foundation, Rao is working on a study of Twitter messages that were sent in the immediate aftermath of the Boston Marathon bombing in 2013. The goal is to understand how rumors spread and how "anti-rumors" can blunt the harmful effects of rampant speculation. "You've got millions of people tweeting, millions of data points that we can collect," he says. "Each data point, each tweet, gives you a sense of the sentiment at that particular point. You can tell whether the sentiment is conducive to spreading rumors or not."
Preparing the next generation
The surge of interest in big data and how it can be used is reflected in the School of Management curriculum. The professors say students, who typically have grown up in an era when every product carries a bar code and every visit to a Web page is tracked by cookies, expect to hear about big data applications in all sorts of courses.
"The businesses, and thus students as their future employees, are asking for courses that teach database management and data mining and analysis tools, especially geared toward handling big data," Talukdar says. "In my teaching area of marketing, many courses are incorporating relevant data mining and analysis tools together with the use of sample subsets of big data from the real world," such as scanner data that retailers collect at checkout.
Students who want to go deeper typically learn computer programming; in the field's shorthand, they "code." Some graduate students also are taking advantage of the Data Intensive Discovery Initiative, a newly developed collaboration among several UB academic units that provides greater computing power than the departments could muster separately.
"Typical PhD students, whatever they have to do, they do it on a computer," says Tiu, who coordinates the doctoral program for finance students. "Our students do code, but they need this expertise. It's very interesting to be able to go to a computer scientist and ask, to what degree can I do this?"
Raj Sharman, associate professor of management science and systems, notes that the school's influence in big data is also being felt worldwide as part of the international Workshop on Information Technology and Systems, held in December 2013 in Milan, Italy. Sharman served as one of two program chairs for the event, and the UB School of Management co-sponsored the workshop with Indiana and Penn State universities and IBM.
No magic bullet
An old axiom of computer programming says, "Garbage in, garbage out," and the School of Management professors emphasize that the mere existence of big data is no guarantee of high-quality results.
Says Sharman: "Currently we are peddling what we know: predictive analysis and data mining. For the practitioner, this is fine. For a researcher, sipping on that old bottle of wine is a recipe for obsolescence. New methods have to come in."
One concern, he says, is that information can be gathered and analyzed at a speed beyond human capacity. "If a manager is given information faster than he is digesting it, then you have information overload and a cognitive problem," Sharman says. "It is the task of business and researchers to deal with that data and provide it to each individual at the rate he can consume. If they don't do that, we have exceeded human competence. The researcher and the analyst have to cook the meal so it is digestible."
In the stock market, Tiu notes, one concern about the velocity of big data is that regulators such as the U.S. Securities and Exchange Commission face challenges in overseeing the markets. "It decreases their ability to see wrongdoing," he says, when trades are made so quickly ("at the millisecond level") and so frequently that they generate mountains of trading records. The SEC could ask to see a firm's trading code, he says, "but that itself may be voluminous."
He also points to market volatility tied to automated trading, such as the day the Associated Press' Twitter feed was hacked and a tweet was sent out saying the president had been injured in an attack on the White House. The AP corrected the mistake immediately, but program trading algorithms that scan the news caused the Dow Jones Industrial Average to fall 143 points before the market recovered in just a few minutes.
The way forward
Beyond such concerns, though, the professors say big data presents big possibilities for business. Talukdar cites, for example, a recent study by the management consulting company McKinsey saying that big data can help retailers to increase their operating margins by 60 percent and enable U.S. health care providers to improve their efficiency to the tune of $300 billion in annual savings.
One limiting factor, he says, is the need for managers and analysts who are conversant in the use of big data. The McKinsey study, he says, "reports that in the near term, the United States faces a shortage of 140,000 to 190,000 people with high-level relevant analytical skills as well as 1.5 million managers and analysts to routinely analyze big data and make decisions based on their findings."
But, he says, "such human resource skills are fundamental for firms to gain a competitive advantage in the global marketplace, and thus they are critical for national economic growth and productivity."
Says Ramesh: "Big data has been here for a long time, but we did not have the technology for processing it. If I was flooded with information, I would use only a small portion because I did not have the technology to process the data and see the big picture. Now I can see both the forest and the trees. And this is only going to get bigger."
Riley Mackenzie is a Buffalo freelance writer.