By Christoph Lagger, Mathias Lux, Oge Marques
We have developed a prototype of an adaptive video retrieval system that leverages knowledge of users’ intentions (and their relationship to video genres/categories), uncovered by our recent studies on “user intentions while watching videos online,” to provide better search results and a user interface adapted to the intentions and needs of its users. The goal of this prototype is to incorporate the user’s context into video retrieval more effectively than baseline video retrieval interfaces (e.g., YouTube) do. The prototype is called “You(r)Intent.”
The prototype (see Figure 1) consists of three main blocks: (i) The user interface, which includes a text input box for typing a query and four buttons for communicating the user’s intention to the system, namely:
- to learn something,
- to be entertained,
- to get informed, or
- to solve a task;
(ii) A ruleset, derived from the results of our studies, which ensures that videos whose categories correlate more strongly with the user’s intention are ranked higher in the search results; and (iii) A collection of video content sources, e.g., Vimeo, YouTube, or Khan Academy.
Figure 1: Block diagram of the prototype for an adaptive video retrieval system.
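To make the role of the ruleset more concrete, the following is a minimal sketch of how intention-based ranking could work. It is not the prototype's actual implementation: the `RULESET` table, its weights, the intention keys, and the `Video` and `rank_by_intention` names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical ruleset: intention -> {category: correlation weight}.
# The weights are illustrative placeholders, NOT values from the study.
RULESET = {
    "learn": {"How-to & Style": 0.9, "Science & Technology": 0.7, "Education": 0.6},
    "entertain": {"Comedy": 0.9, "Music": 0.8},
}


@dataclass(frozen=True)
class Video:
    title: str
    category: str
    source: str


def rank_by_intention(videos, intention, ruleset=RULESET):
    """Order videos by how strongly their category correlates with the intention.

    Categories missing from the ruleset get weight 0.0. Python's sort is
    stable, so videos with equal weight keep their original relative order.
    """
    weights = ruleset.get(intention, {})
    return sorted(videos, key=lambda v: weights.get(v.category, 0.0), reverse=True)
```

Keeping the ruleset as plain data, separate from the ranking function, would allow the weights to be revised as further study results become available without touching the retrieval code.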
For clarity, consider a simple example scenario. Assume a user wants to learn about a specific topic and types the query “moonwalk.” Since the user has a “learning intention” in mind, she clicks on the “I want to learn something” button. Our pattern-based ruleset then tailors the search to certain sources and categories. In this example, YouTube and Khan Academy might be used as video sources, and the videos are ranked by category, from the strongest to the weakest correlation with the user’s intention, leading to the result screen shown in Figure 2. Notice how three clips from the “How-to & Style” category (all related to the moonwalk dance popularized by Michael Jackson) appear at the top of the screen, while the second-highest-ranked category (“Science & Technology”) appears in a second block, mostly containing NASA footage from Apollo-era moon explorations. By prioritizing the categories most strongly correlated with the user’s intention and adopting a visually pleasant layout that shows them in easily distinguishable blocks, we circumvent the ambiguity of the query term and provide an intuitive way to navigate to the desired result.
Figure 2: Result screen of the prototype when the user queries for videos containing the “moonwalk” keyword and expresses the intention to learn something.
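The block layout of the result screen can be sketched in the same spirit. The snippet below filters results to the sources selected for an intention and groups them into per-category blocks, ordered from the strongest to the weakest correlation. The function name, the tuple layout, the video titles, and the numeric weights are assumptions made for this example; only the category order and the two sources follow the scenario described above.

```python
def group_into_blocks(results, category_weights, allowed_sources):
    """Group (title, category, source) results into ordered category blocks.

    Results from sources outside `allowed_sources` are dropped. Blocks are
    returned as (category, [titles]) pairs, strongest correlation first.
    """
    blocks = {}
    for title, category, source in results:
        if source in allowed_sources:
            blocks.setdefault(category, []).append(title)
    ordered = sorted(
        blocks, key=lambda c: category_weights.get(c, 0.0), reverse=True
    )
    return [(category, blocks[category]) for category in ordered]


# Hypothetical data for the "moonwalk" query with a learning intention.
learn_weights = {"How-to & Style": 0.9, "Science & Technology": 0.7}
learn_sources = {"YouTube", "Khan Academy"}
results = [
    ("Apollo 11 footage", "Science & Technology", "YouTube"),
    ("Moonwalk tutorial", "How-to & Style", "YouTube"),
    ("Moonwalk remix", "Music", "Vimeo"),
    ("Moonwalk step by step", "How-to & Style", "YouTube"),
]

for category, titles in group_into_blocks(results, learn_weights, learn_sources):
    print(category, titles)
```

Grouping by category before display is what lets an ambiguous query such as “moonwalk” resolve visually: the dance tutorials and the space footage end up in separate blocks rather than interleaved in one list.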
After completing the first version of the prototype, we conducted a user survey in which participants performed specific video retrieval tasks and reported how easily they carried out those tasks and how satisfied they were with the results. Evaluation methods included observation of the participants, analysis of mouse-tracking heat maps, activity logging, and semi-structured interviews. Overall, participants solved each task somewhat easily and were very satisfied while working with the prototype. Moreover, the position at which a video of interest (one that would solve the task at hand) appeared in the results was considered satisfactory throughout all tests.
C. Lagger, M. Lux, and O. Marques, “What Makes People Watch Online Videos: An Exploratory Study,” ACM Computers in Entertainment (2012) [submitted]