MapReduce, Data Mining, and Ranking
Given a Google search query, find the AdWords categories that best describe it.
Language Vocabulary Sampling
You are given a stream of conversation in a mysterious language from the planet Mars. Approximate the vocabulary of this language, that is, its set of distinct words.
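One possible way to approach this, sketched in Python under stated assumptions: the stream is assumed to be already split into words, and the function name sample_vocabulary is hypothetical. The sketch collects the distinct words seen so far and uses the Good-Turing estimate (the fraction of tokens whose word has occurred exactly once) to approximate the probability that the next word is one not yet in the vocabulary. A small value suggests the sample already covers most of the language.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def sample_vocabulary(words: Iterable[str]) -> Tuple[Set[str], float]:
    """Return the observed vocabulary and a Good-Turing estimate of the
    probability that the next word in the stream is unseen.

    Assumption: `words` yields one already-tokenized word at a time.
    """
    counts: Counter = Counter()
    total = 0
    for word in words:
        counts[word] += 1
        total += 1

    # Words seen exactly once ("singletons") drive the Good-Turing estimate.
    singletons = sum(1 for c in counts.values() if c == 1)
    # With no data at all, every word is unseen, so the estimate is 1.0.
    unseen_mass = singletons / total if total else 1.0
    return set(counts), unseen_mass
```

In practice one might keep sampling until the estimated unseen mass drops below a chosen threshold; that stopping rule is a design choice, not part of the problem statement.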
Median URL length across all known URLs with MapReduce
You are given the data set of all URLs that Google has crawled on the Internet, which is a very large set. Write an algorithm to find the median URL length.
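A minimal sketch of one common approach, in Python: because URL lengths are small integers, each mapper can emit a histogram of lengths for its shard, and a single reducer can merge the histograms and walk the cumulative counts to the middle rank. The function names (map_lengths, reduce_histograms, median_from_histogram) and the in-memory lists standing in for shards are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable


def map_lengths(urls: Iterable[str]) -> Counter:
    """Map step: histogram of URL lengths for one shard."""
    return Counter(len(u) for u in urls)


def reduce_histograms(histograms: Iterable[Counter]) -> Counter:
    """Reduce step: merge per-shard histograms by adding counts."""
    total: Counter = Counter()
    for h in histograms:
        total.update(h)
    return total


def median_from_histogram(hist: Counter) -> float:
    """Walk lengths in ascending order until the middle rank(s) are covered."""
    n = sum(hist.values())
    if n == 0:
        raise ValueError("no URLs")
    lo_rank, hi_rank = (n - 1) // 2, n // 2  # 0-based middle positions
    seen = 0
    lo = hi = None
    for length in sorted(hist):
        seen += hist[length]
        if lo is None and seen > lo_rank:
            lo = length
        if seen > hi_rank:
            hi = length
            break
    # For an even count this averages the two middle lengths.
    return (lo + hi) / 2
```

The merged histogram has at most one entry per distinct length, so the reducer's memory depends on the longest URL rather than on the number of URLs.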
Top 1000 highest-scoring URLs in Google search results
You have trillions of unique URLs, each with a score, stored in no particular order across many machines. Find the 1000 URLs with the highest scores.
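One way this could be sketched in Python: since every URL lives on exactly one machine, any URL in the global top k must also be in the top k of its own machine. Each machine therefore selects its local top k, and a coordinator merges those short candidate lists. The names local_top_k and global_top_k, the (score, url) tuple layout, and the in-memory lists standing in for machines are assumptions made for illustration.

```python
import heapq
from typing import Iterable, List, Tuple

Scored = Tuple[float, str]  # (score, url); assumed record layout


def local_top_k(records: Iterable[Scored], k: int) -> List[Scored]:
    """Runs on each machine: keep only the k highest-scoring records.

    heapq.nlargest streams over the input with a heap of size k,
    so memory stays O(k) regardless of shard size.
    """
    return heapq.nlargest(k, records)


def global_top_k(shards: Iterable[Iterable[Scored]], k: int) -> List[Scored]:
    """Runs on the coordinator: merge per-machine candidates.

    At most k records per machine reach this step.
    """
    candidates = (rec for shard in shards for rec in local_top_k(shard, k))
    return heapq.nlargest(k, candidates)
```

For the problem as stated, k would be 1000, so the coordinator handles at most 1000 records per machine rather than trillions.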