Chapter 4: Web Search
Personalized search refers to search experiences that are tailored specifically to an individual’s interests by incorporating information about the individual beyond the specific query provided. Pitkow et al. describe two general approaches to personalizing search results: one modifies the user’s query, and the other re-ranks the search results.
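The two approaches can be illustrated with a minimal sketch. It is not taken from Pitkow et al.; the function names `expand_query` and `rerank`, and the idea of representing a user's interests as a plain list of terms, are assumptions made only for illustration.

```python
def expand_query(query, interests):
    """Approach 1 (query modification): append interest terms
    that are not already present in the query."""
    present = set(query.lower().split())
    extra = [term for term in interests if term.lower() not in present]
    return " ".join([query] + extra)


def rerank(results, interests):
    """Approach 2 (re-ranking): move results that share more words
    with the user's interests toward the top. Python's sort is stable,
    so results with equal overlap keep their original order."""
    wanted = {term.lower() for term in interests}

    def overlap(text):
        return len(wanted & set(text.lower().split()))

    return sorted(results, key=overlap, reverse=True)


# Hypothetical user whose profile contains the single interest "car".
interests = ["car"]
expanded = expand_query("jaguar", interests)
reranked = rerank(
    ["jaguar big cat habitat", "jaguar car dealership", "jaguar car review"],
    interests,
)
```

In this toy example the same profile is used both ways: once to change what is asked, and once to change the order of what comes back.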
Google introduced personalized search in 2004 and implemented it in Google Search in 2005. Google personalizes search for everyone, not just for those who have a Google account. Little information is available on exactly how Google personalizes its searches; however, it is believed to use the user’s language, location, and web history.
Early search engines, like Yahoo! and AltaVista, found results based only on keywords. Personalized search, as pioneered by Google, has become far more complex, with the goal to “understand exactly what you mean and give you exactly what you want.” Using mathematical algorithms, search engines are now able to rank results based on the number of links to and from sites; the more links a site has, the higher it is placed on the page. Search engines can act with two degrees of expertise: as a shallow expert or as a deep expert. A shallow expert serves as a witness who knows some specific information about a given event. A deep expert, on the other hand, has comprehensive knowledge that gives it the capacity to deliver unique information that is relevant to each individual inquirer. If a person knows what he or she wants, the search engine acts as a shallow expert and simply locates that information. But search engines are also capable of deep expertise in that they rank results, indicating that those near the top are more relevant to a user’s wants than those below.
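As a rough illustration of link-based ranking, the sketch below orders sites by how many links point to or from them, which is the simple rule described above. Production search engines use considerably more elaborate link analysis; the function name `rank_by_links` and the toy link graph are assumptions made for this example.

```python
from collections import Counter


def rank_by_links(links):
    """Rank sites by the number of links to and from them.

    `links` is a list of (source, destination) pairs. Ties are broken
    alphabetically so that the ordering is deterministic.
    """
    counts = Counter()
    for source, destination in links:
        counts[source] += 1       # a link from this site
        counts[destination] += 1  # a link to this site
    return sorted(counts, key=lambda site: (-counts[site], site))


# Hypothetical link graph: site "b" takes part in the most links.
links = [("a", "b"), ("c", "b"), ("b", "d"), ("a", "c")]
ranking = rank_by_links(links)
```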
While many search engines take advantage of information about people in general, or about specific groups of people, personalized search depends on a user profile that is unique to the individual. Research systems that personalize search results model their users in different ways. Some rely on users explicitly specifying their interests or on demographic or cognitive characteristics, but user-supplied information can be hard to collect and keep up to date. Others build implicit user models based on content the user has read or on their history of interaction with Web pages.
There are several publicly available systems for personalizing Web search results (e.g., Google Personalized Search and Bing’s search result personalization). However, the technical details and evaluations of these commercial systems are proprietary. One technique Google uses to personalize searches for its users is to track login time and whether the user has enabled web history in the browser. The more often you visit the same site through a Google search result, the more strongly Google infers that you like that page. So when you do certain searches, Google’s personalized search algorithm gives the page a boost, moving it up through the ranks. Even if you are signed out, Google may personalize your results because it keeps a 180-day record of what a particular web browser has searched for, linked to a cookie in that browser.
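Because the details of Google's algorithm are proprietary, the following is only a toy sketch of the boosting idea described above: pages the user has repeatedly clicked from earlier result lists move up. The function `boost_by_clicks`, the `weight` parameter, and the scoring formula are all hypothetical.

```python
from collections import Counter


def boost_by_clicks(results, click_history, weight=1.0):
    """Re-order `results` so that frequently clicked URLs rise.

    Each URL starts with a base score from its original position
    (first place scores highest) and gains `weight` points for every
    past click recorded in `click_history`.
    """
    clicks = Counter(click_history)
    total = len(results)

    def score(item):
        position, url = item
        return (total - position) + weight * clicks[url]

    ranked = sorted(enumerate(results), key=score, reverse=True)
    return [url for _, url in ranked]


# Hypothetical history: the user clicked c.com three times before.
original = ["a.com", "b.com", "c.com"]
personalized = boost_by_clicks(original, ["c.com", "c.com", "c.com"])
```

With an empty click history the original order is returned unchanged, which corresponds to a user for whom no history is available.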
To better understand how personalized search results are presented to users, a group of researchers at Northeastern University compared an aggregate set of searches from logged-in users against a control group. The research team found that 11.7% of results show differences due to personalization, although this varies widely by search query and result ranking position. Of the various factors tested, the two that had a measurable impact were being logged in with a Google account and the IP address of the searching user. Query categories with high degrees of personalization include companies and politics. One of the factors driving personalization is localization of results, with company queries showing store locations relevant to the location of the user. So, for example, if you searched for “used car sales,” Google may return results for car dealerships in your area. On the other hand, queries with the least personalization include factual queries (“what is”) and health.
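One simple way to quantify such differences is to compare the result list a test account receives with the list a control account receives for the same query. The sketch below is not the researchers' actual methodology; the two metrics and the function names `fraction_changed` and `jaccard` are assumptions chosen for illustration.

```python
def fraction_changed(control, treatment):
    """Share of rank positions at which the two lists show a different URL."""
    pairs = list(zip(control, treatment))
    if not pairs:
        return 0.0
    return sum(c != t for c, t in pairs) / len(pairs)


def jaccard(control, treatment):
    """Overlap of the two result sets, ignoring order (1.0 means identical sets)."""
    a, b = set(control), set(treatment)
    if not (a or b):
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical result lists for the same query.
control = ["a", "b", "c", "d"]
treatment = ["a", "b", "d", "e"]
changed = fraction_changed(control, treatment)
overlap = jaccard(control, treatment)
```

Comparing two control accounts with each other in the same way would give an estimate of the background noise that has to be subtracted before attributing differences to personalization.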
When measuring personalization, it is important to eliminate background noise. In this context, one type of background noise is the carry-over effect: when you perform a search and follow it with a subsequent search, the results of the second search are influenced by the first. This is a form of personalization based on recent search history, but it is not a consistent element of personalization because, according to the researchers, the effect times out after 10 minutes. Another point to note is that the top-ranked URLs are less likely to change because of personalization, with most personalization occurring at the lower ranks.
The Filter Bubble
Several concerns have been raised regarding personalized search. It decreases the likelihood of finding new information by biasing search results towards what the user has already found. It also introduces potential privacy problems: a user may not be aware that their search results are personalized for them, and may wonder why the things they are interested in have become so relevant. This problem has been termed the “filter bubble” by author Eli Pariser. He argues that people are letting major websites drive their destiny and make decisions based on the vast amount of data they have collected on individuals. This can isolate users in their own worlds, or “filter bubbles,” where they see only the information that they want to see, a consequence of “The Friendly World Syndrome.” As a result, people are much less informed about problems in the developing world, which can further widen the gap between the North (developed countries) and the South (developing countries).
Personalization methods often “promote” results that show up regularly in searches by like-minded individuals in the same community, which makes it easy to understand how the filter bubble arises. As certain results are bumped up and viewed more by individuals, other results not favored by them are relegated to obscurity. As this happens on a community-wide level, it results in the community, consciously or not, sharing a skewed perspective of events.
An area of particular concern in some parts of the world is the use of personalized search as a form of control over the people using it, by giving them only particular information. This can be used to exert influence over widely discussed topics such as gun control, or even to steer people toward siding with a particular political regime in different countries. While total control by a government through personalized search alone is a stretch, the information readily available from searches can easily be controlled by the richest corporations. The biggest example of a corporation controlling information is Google. Google is not only feeding you the information it wants you to see, but is at times using your personalized search to steer you towards its own companies or affiliates. This has led to control over various parts of the web and the pushing out of competitors; for example, Google Maps took a dominant position in the online map and directions industry, with MapQuest and others forced to take a backseat.
Many search engines use concept-based user profiling strategies that derive only the topics that users are highly interested in, but for best results, according to researchers Wai-Tin and Dik Lun, both positive and negative preferences should be considered. Profiles that apply both negative and positive preferences yield the highest-quality and most relevant results by separating similar queries from dissimilar ones. For example, typing in ‘apple’ could refer to either the fruit or the Macintosh computer, and capturing both kinds of preference helps the search engine learn, from the links clicked, which apple the user is really looking for. One concept-based strategy the researchers proposed to improve personalized search and capture both positive and negative preferences is the click-based method. This method captures a user’s interests based on which links they click on in a results list, while downgrading unclicked links.
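The click-based idea can be sketched as follows. This is a simplified illustration, not the researchers' published algorithm: the concept labels attached to each result, the `step` size, and the function names `update_profile` and `score` are assumptions.

```python
from collections import defaultdict


def update_profile(profile, shown, clicked, step=1.0):
    """Update concept weights after one results page.

    `shown` maps each displayed URL to its list of concepts.
    Concepts of clicked URLs gain weight (positive preference);
    concepts of unclicked URLs lose weight (negative preference).
    """
    for url, concepts in shown.items():
        delta = step if url in clicked else -step
        for concept in concepts:
            profile[concept] += delta
    return profile


def score(profile, concepts):
    """Score a candidate result by summing the weights of its concepts."""
    return sum(profile[concept] for concept in concepts)


# Hypothetical results page for the query "apple".
profile = defaultdict(float)
shown = {
    "apple.example/mac": ["computer"],
    "fruit.example/apple": ["fruit"],
}
update_profile(profile, shown, clicked={"apple.example/mac"})
```

After this single interaction, a later result about computers scores above one about fruit, which is how the profile helps to disambiguate an ambiguous query such as ‘apple’.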
Personalized search also has profound effects on the search engine optimization industry, because search results are no longer ranked the same way for every user. An example of this is found in Eli Pariser’s The Filter Bubble, where he had two friends type “BP” into Google’s search bar. One friend found information on the BP oil spill in the Gulf of Mexico, while the other retrieved investment information.