Business Daily from THE HINDU group of publications
Monday, Oct 01, 2007


‘Search is a digression’

K. Bharat Kumar

Tech veteran on what Net users are looking for when they hit the search button.



Dr Prabhakar Raghavan

Not too many people give you examples of commendable work being done by competing companies while explaining the work their own companies do. But you probably wouldn’t find too many people like Dr Prabhakar Raghavan, Head of Research at Yahoo! A Ph.D in computer science from Berkeley and a B Tech from IIT-Chennai, Dr Raghavan is also a consulting professor at Stanford. He says it is only fair to talk about the competition as well, in the spirit of ‘non-partisanship’.

And it’s easy to be intimidated by the titles of the books he has written: ‘Randomized Algorithms’ and ‘Introduction to Information Retrieval’, the latter being under publication. But Dr Raghavan takes you through the complex maze of Internet search and related services in a dazzlingly simple fashion. Excerpts from a conversation with K Bharat Kumar:

You have talked of the evolution of search: from objective rankings to advertisement based rankings; click-through advertisements to Search Engine Optimised (SEO) text. What do you see Internet Search evolving into now?

Billions of people use Search. Strangely, no one wants to search. They want to get tasks done. They want to run their lives. Search is a digression.

Assume you want to take a vacation. You search, get results, aggregate information… all this only to conduct another search. You spend a lot of time to get your task done.

The glaring inequity here is that, over time, all the machines that you used or accessed for information spent about five seconds, while you spent five hours. There is something wrong here. In a sense, what engines are trying to do now is to recognise that people ultimately have intents towards fulfilling tasks and to that end, our job is to offer a quick and pleasing resolution of users’ goals, instead of going through the sequence of, “query, you get 10 documents, query again and you get another 10 documents...” and so on.

Each engine is trying a bunch of things. In the spirit of non-partisanship, I shall give you a Google example and a Yahoo! example.

At Yahoo!, type ‘Papa John’s’. It’s a pizza chain in the US. It’s also a publicly traded company. The top result is the home page of the company. That is to be expected. Along with this, I give two more links. I see that it could be an investor interested in the stock. So I give a link to Yahoo! Finance for related information about the company. On the other hand, you could be hungry. So, I give you phone numbers and addresses of outlets nearest your home. I am now trying to resolve possible goals.

So, we sequence your query, divine your intent and get you there as quickly as possible. It serves you and hence us well. So you think we are prescient and you come back again and again.

Google talks about universal search. Microsoft and we are thinking about it. We do not want to give users 10 matching documents but actually craft an experience that fulfils a need.

Airports have three-letter codes. For example, MAA stands for Chennai while SFO stands for San Francisco. If you type these two along with a date, Google gives you matching documents that could be nonsense. At the top, it also fills out a travel form with these two cities with a couple of dates with links to Expedia and the like. Anyone typing that in is almost certain to buy air tickets. So, why not give it to them, instead of some documents?
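The airport-code example can be illustrated with a minimal Python sketch. It is not how Google or Yahoo! actually implement this; the query shape (two three-letter codes followed by an ISO-style date), the function name `travel_intent` and the two-entry lookup table are all assumptions made for illustration. Only the codes MAA and SFO come from the interview.

```python
import re
from typing import Optional

# Illustrative table only: the two airport codes mentioned in the interview.
AIRPORTS = {"MAA": "Chennai", "SFO": "San Francisco"}

# Assumed query shape: two three-letter codes, then a date such as 2007-10-01.
QUERY_PATTERN = re.compile(
    r"^\s*([A-Za-z]{3})\s+([A-Za-z]{3})\s+(\d{4}-\d{2}-\d{2})\s*$"
)


def travel_intent(query: str) -> Optional[dict]:
    """Return a pre-filled travel form if the query looks like a flight search.

    Returns None when the query does not match, so the caller can fall back
    to ordinary document results.
    """
    match = QUERY_PATTERN.match(query)
    if match is None:
        return None
    origin = match.group(1).upper()
    destination = match.group(2).upper()
    date = match.group(3)
    # Three-letter words that are not known airport codes are not flight searches.
    if origin not in AIRPORTS or destination not in AIRPORTS:
        return None
    return {"from": AIRPORTS[origin], "to": AIRPORTS[destination], "date": date}


if __name__ == "__main__":
    print(travel_intent("MAA SFO 2007-10-01"))
    print(travel_intent("pizza near me"))
```

The point of the sketch is the fallback: a recognised intent produces a form to act on, while anything else still goes to the usual list of matching documents.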

What impact does that approach have on search on a mobile phone?

Mobile form factor is limited. There is very little chance that I can give you 10 useful and matching documents. So when you type out a query, I don’t want to let you see one document after the other. I’d rather give you an experience.

It so happens mobile search has different requirements compared to PC search. You might want to do different things using your mobile and clearly you are not sitting at your desktop but are wandering around. You have an immediate need — restaurant, movie theatre, whatever. We launched OneSearch.

There, if you type out ‘Spiderman’, you don’t want 10 documents with it. It’s useless. So what we give you are links to a trailer, to a list of nearby theatres showing it, timings for the same, reviews and the like. This entails going beyond retrieving and returning documents to you, onto objects such as reviews and trailers, and juxtaposing them for you. We have to deal with objects and not documents.

We are in pursuit of this goal of intent-driven search. It is not only form-factor dependent, but dependent on the Internet Protocol (IP) address so as to ascertain location. A human can tell if a document is in English or not. But a computer cannot, so easily. For instance, Indonesian and Malay use the same script so we have to use cues such as the IP address it comes from. ‘The’ does not mean much in English, but in French, it’s tea, a drink. So I have to do something different for that.

Your team comprises not just Ph.Ds in Computer Science but those from other fields as well. Why is it important for you?

Search has been (traditionally) related to computing technology. We see if a document is relevant to your query and take our best shot at ranking and so on. But when there is a notion of intent, our first problem is to figure out that intent. And then, it’s not a matter of the relevance of a document. I have to figure out if the user gets satisfied. How do I know that he is or is not? Cognitive psychology, not IT or Computer Science, comes into play. Here there is a blend into the humanities. That’s a fascinating turn of events for computing technology. It’s a consequence of a user base of a billion people who are not trained scientists and who don’t exactly know what they want but type what they type.

There’s a parallel to this on the advertisement side. How do you best place click-on ads? Here is a confluence of computing and microeconomics. We have economists who talk about auction theory for pricing, equilibria for bidder strategies and so on.

Over a year ago, after I joined, I figured I had to have a world-class economics department. We have a micro-economics department. Our first hire was Dr Schwarz. His students are doing some seminal work on auction pricing for ads.

Preston McAfee, Dean of economics and social science at Caltech, heads my economics department. They are fascinating and think differently from people like myself. It leads to very interesting questions: how do you use game theory to predict behaviour, for instance. Fascinating.

Any examples of contributions from these people to the way you think about your user base?

Yes. What role do economists play here? We present experiences here. Unwittingly, you could write code that sets up economic barriers. It lowers the value of the system. Look at our Personals (or dating) site. Before the economists got involved, it was free. It wasn’t popular. Then it went paid and popularity shot up. That looks counter intuitive to us. To an economist, it makes complete sense. The reason is this: by creating scarcity, you’ve made the club more elite and hence all of a sudden, it is interesting for people to go there. They don’t find spam and junk but find value now.

Another example: Anyone can come into Personals and browse listings. Only a paying member can send a message, or a proposition, to someone else.

The interface doesn’t tell you if the recipient is a member or not. That damages the value of the system dramatically. The sender does not know if the recipient can respond. So they spam and send out millions of messages. The recipient gets millions of such messages.

If you can budget the number of messages someone sends, and if you could surface who is a member and who is not, it increases the value of the system. These look like interface changes but an economist thinks of these things. These changes rely on principles that do not stem from computer science.
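The two changes described here, a message budget and visible membership status, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Member` class, the `DAILY_BUDGET` value of 5 and the function names are hypothetical and are not taken from Yahoo! Personals.

```python
from dataclasses import dataclass

# Assumed cap on messages per sender per day; the interview gives no figure.
DAILY_BUDGET = 5


@dataclass
class Member:
    name: str
    paying: bool          # only paying members may send or reply
    sent_today: int = 0   # messages already sent in the current day


def membership_label(person: Member) -> str:
    """Surface to the sender whether the recipient is able to respond."""
    return "member" if person.paying else "non-member"


def send_message(sender: Member, recipient: Member, budget: int = DAILY_BUDGET) -> bool:
    """Deliver a message only if the sender is a paying member with budget left.

    The recipient is accepted here but not checked: their status is surfaced
    separately by membership_label so the sender can decide before writing.
    """
    if not sender.paying:
        return False
    if sender.sent_today >= budget:
        return False
    sender.sent_today += 1
    return True


if __name__ == "__main__":
    alice = Member("alice", paying=True)
    bob = Member("bob", paying=False)
    print(membership_label(bob))      # lets alice see bob cannot reply
    print(send_message(alice, bob))   # allowed, but counts against her budget
```

Because every message now costs part of a limited budget, and the sender can see who is able to reply, mass unsolicited messaging becomes unattractive without any change to the ranking or matching code.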

And conversely as CS professionals, we can design systems that do huge damage to economic value and not even realise it.

Next, sociologists and cognitive psychologists are valuable and tell us how people behave. One fascinating area for us is how networks of people behave and influence each other. If you have 50 friends and another has five, are you 10 times or twice or 100 times as valuable to Yahoo! as the other person is? These are little understood, for now.

Dr Duncan Watts joined us a few weeks ago. He wrote the book, “Six Degrees: The Science of a Connected Age.” It builds on networks of acquaintanceship and the idea that there are six hops between any two people. He has done pioneering work on that. People like him give us insight into how networks of users behave. There are theories that are hard to test in the physical world but it is possible to see how they unfold on the Net.

For instance, there is the concept of Mavens who are influencers and that is whom you should be targeting to spread the good word for you.

We hear that you’d like at least one of your team to win the Nobel Prize.

(Smiles) That’s an aspiration. But what we are aiming at is scientific influence at the highest level. We should become comparable to the highest research institutions in the world. For a research institution as young as we are, our scientists have begun winning prizes and that pleases us. May not be the Nobel, but there are other prizes (smiles).

You have talked about this supply chain of information. Since your coming on board, how much have you traversed, and any ideas that you brought to bear on the team?

We have done a number of things… We brought in very heavy-duty talent. The two things you have to do with world-class talent are to hire them and to leave them alone. I focus on hiring them and try to stay out of the way.

We are in an immature, early industry that is moving extremely fast. So, I am not thinking about what influence I would have in 50 years. A lot of senior, mature people have been around long enough and when they see a business problem, they attack the underlying issue. It’s started to show in things happening in our products and in the next two years, it would have business impact.

In the past, you have indicated that Google may be good at Search but that Yahoo! is in many more areas and that it is best placed to exploit the characteristics of the Net medium. How is that battle being played out?

Net users engage in a spectrum of activities. There are a number of touch points. Our industry is young but you have broad categories into which you can break down Net usage. Search is one. Communication is the second broad bucket. We have the largest e-mail base and one of the largest IM bases. People spend a lot of time on our communication platform. The third bucket is where people tend to consume content, but in the last two-three years, they have also started creating content.

That is interesting. There is clearly value there. Reviews are an example; uploading and tagging photos on Flickr, tagging pages on Del.icio.us is another. The trick that no one has figured out here completely is, how do you complete a virtuous cycle so that contributors also gain and people are incentivised to contribute content?

It’s not necessarily money as incentive. I should feel good that something useful emerges out of it. In Yahoo! Answers, you get points so you get a rush. Question is how sustainable is that and how much value is generated from the content that is unique and defensible. It’s glib to say that Yahoo! Answers succeeded and that Google shut down its version. Now comes the ‘then what?’ question.

The challenge for us is to figure out underlying motives and why people do what they do. Maybe in search there is intent to get a task done. Communication is more casual; human nature is to communicate. As to content creation and consumption, human nature plays a role there too. But you need to get the expression to create greater good.

Communication and content creation and consumption will start to blur. It’s not age-related nor does it have to do with money. Why do people play video games? If anything, monetary incentive is negative. There is a rush from seeing your name on the leader board. Is that sufficient to create content? It has something going for it. So enough people participate and contribute. It goes back to social and psychological study. Those specialists have to explain to us why people want to do this and contribute. Then economists should come in and tell us how to set the points system so that we can maximise value. It is a combination of audience growth and knowledge growth. These are hard problems and we barely know how to solve them.

We had a product and it took off. We now want to take a harder look and make it succeed. By success I mean, “Engage a billion people.” Right now it’s at 100 million.

How do you measure yourself and your team?

Because we are a research organisation, scientific leadership is an important criterion; we need people to lead in their area. We look at mundane measures that research labs use, such as patents filed. But on the other hand, if all we did was file a bunch of patents, that is not good enough either.

It has to be something deeper. The hope and the vision is to have pervasive and large scale business impact. That would eventually be measured in dollar terms. Large-scale business impact for us is not even $10 million but $1 billion. So when a scientist looks to do something on the business side, we encourage the seeking of big business impact projects. Something you hope, if it works out well, a billion dollars. Not, in the best-case scenario $5 million.

You will often hear this as research philosophy: you have to fail often. If you don’t, then you aren’t trying hard enough.

That is the metric for us — failure!

A few words on your India centre?

Our India centre does software for a number of areas. Data mining is one. We accumulate about 20 terabytes of data; we mine that and use it to improve user experience, and a good deal of that is done in Bangalore. Good chunks of our e-mail and instant messaging software are built in Bangalore.

We are growing our effort on the search and advertisement front. For instance, our safe-image searching feature was developed in Bangalore. Parents don’t want kids to view pornographic images. A lot of that filtering, based on sophisticated image analysis, is done here.

But, building a research organisation has its challenges. In Bangalore, we have an advanced technology group. We don’t have a research group there since it is very hard to find researchers in India. The US produces 1,400 Ph.Ds a year in Computer Science, China produces 3,000, Israel has 50. India has only 30 a year.

There is a paucity on the supply side. That makes it hard to come in and say we are going to open a world-class research centre. Even then, we can get talented people who can make good use of our resources, and that’s why we have a technology group that does nice work.

We also set up a group more focused on academic relations around the world. Once you reach a certain size and maturity, you put in a lot of care in deepening relations with academia.

In addition to the advanced technology group, we have an engineering group that builds software for the scientists. These are research engineers, not Ph.Ds, but their job is to work with scientists and build up experiments, demos and proofs of concept.

bharatk@thehindu.co.in


Copyright © 2007, The Hindu Business Line. Republication or redissemination of the contents of this screen are expressly prohibited without the written consent of The Hindu Business Line