The Google logo, displayed at the opening of the new Google data center in Eemshaven, The Netherlands. Credit: Vincent Jannink/European Pressphoto Agency
Wealth and influence in the technology business have always been about gaining the upper hand in software or the machines that software ran on.
Now data — gathered in those immense pools of information that are at the heart of everything from artificial intelligence to online shopping recommendations — is increasingly a focus of technology competition. And academics and some policy makers, especially in Europe, are considering whether big internet companies like Google and Facebook might use their data resources as a barrier to new entrants and innovation.
In recent years, Google, Facebook, Apple, Amazon and Microsoft have all been targets of tax evasion, privacy or antitrust investigations. But in the coming years, who controls what data could be the next worldwide regulatory focus as governments strain to understand and sometimes rein in American tech giants.
The European Commission and the British House of Lords both issued reports last year on digital “platform” companies that highlighted the essential role that data collection, analysis and distribution play in creating and shaping markets. And the Organization for Economic Cooperation and Development held a meeting in November to explore the subject, “Big Data: Bringing Competition Policy to the Digital Era.”
As government regulators dig into this new era of data competition, they may find that standard antitrust arguments are not so easy to make. Using more and more data to improve a service for users and more accurately target ads for merchants is a clear benefit, for example. And free internet services, by definition, do not impose higher prices on consumers.
“You certainly don’t want to punish companies because of what they might do,” said Annabelle Gawer, a professor of the digital economy at the University of Surrey in England, who made a presentation at the Organization for Economic Cooperation and Development meeting. “But you do need to be vigilant. It’s clear that enormous power is in the hands of a few companies.”
Maurice Stucke, a former Justice Department antitrust official and a professor at the University of Tennessee College of Law, who also spoke at the gathering, said one danger was that consumers might be afforded less privacy than they would choose in a more competitive market.
The competition concerns echo those that gradually emerged in the 1990s about software and Microsoft. The worry is that as the big internet companies attract more users and advertisers, and gather more data, a powerful “network effect” effectively prevents users and advertisers from moving away from a dominant digital platform, like Google in search or Facebook in consumer social networks.
Evidence of the rising importance of data can be seen from the frontiers of artificial intelligence to mainstream business software. And certain data sets can be remarkably valuable for companies working on those technologies.
A prime example is Microsoft’s purchase of LinkedIn, the business social network, for $26.2 billion last year. LinkedIn has about 467 million members, and it houses their profiles and maps their connections.
Microsoft is betting LinkedIn, combined with data on how hundreds of millions of workers use its Office 365 online software, and consumer data from search behavior on Bing, will “power a set of insights that we think is unprecedented,” said James Phillips, vice president for business applications at Microsoft.
A Google data center in Oklahoma. Credit: Google
In an email to employees, Satya Nadella, Microsoft’s chief executive, described the LinkedIn deal as a linchpin in the company’s long-term goal to “reinvent productivity and business processes” and to become the digital marketplace that defines “how people find jobs, build skills, sell, market and get work done.”
IBM has also bet heavily on data for its future. Its acquisitions have tended to be in specific industries, like its $2.6 billion purchase last year of Truven Health, which has data on the cost and treatment of more than 200 million patients, or in specialized data sets useful across several industries, like its $2 billion acquisition of the digital assets of the Weather Company.
IBM estimates that 70 percent of the world’s data is not out on the public web, but in private databases, often to protect privacy or trade secrets. IBM’s strategy is to take the data it has acquired, add customer data and use that to train its Watson artificial intelligence software to pursue such tasks as helping medical researchers discover novel disease therapies, or flagging suspect financial transactions for independent auditors.
“Our focus is mainly on nonpublic data sets and extending that advantage for clients in business and science,” said David Kenny, senior vice president for IBM’s Watson and cloud businesses.
At Google, the company’s drive into cloud-delivered business software is fueled by data, building on years of work done on its search and other consumer services, and its recent advances in image identification, speech recognition and language translation.
For example, a new Google business offering — still in the test, or alpha, stage — is a software service to improve job finding and recruiting. Its data includes more than 17 million online job postings and the public profiles and résumés of more than 200 million people.
Its machine-learning algorithms distilled that to about four million unique job titles, ranked the most common ones and identified specific skills. The job sites CareerBuilder and Dice are using the Google technology to show job seekers more relevant openings. And FedEx, the giant package shipper, is adding the service to its recruiting site.
That is just one case, said Diane Greene, senior vice president for Google’s cloud business, of what is becoming increasingly possible — using the tools of artificial intelligence, notably machine learning, to sift through huge quantities of data to provide machine-curated data services.
“You can turn this technology to whatever field you want, from manufacturing to medicine,” Ms. Greene said.
Fei-Fei Li, director of the Stanford Artificial Intelligence Laboratory, is taking a sabbatical to become chief scientist for artificial intelligence at Google’s cloud unit. She sees working at Google as one path to pursue her career ambition to “democratize A.I.,” now that the software and data ingredients are ripe.
“We wouldn’t have the current era of A.I. without the big data revolution,” Dr. Li said. “It’s the digital gold.”
In the A.I. race, better software algorithms can put you ahead for a year or so, but probably no more, said Andrew Ng, a former Google scientist and adjunct professor at Stanford. He is now chief scientist at Baidu, the Chinese internet search giant, and a leading figure in artificial intelligence research.
Rivals, he added, cannot unlock or simulate your data. “Data is the defensible barrier, not algorithms,” Mr. Ng said.