Proposal Defense: Pei-Chia Chang, "A Personalized Recommender Agent for the World Wide Web - A Semantic Perspective" (5/1/2009)
May 1, 2009, 4-5 p.m., POST 302
Cross-system personalization (CSP) aims to provide personalization based on profiles and protocols shared among different service systems (Mehta, Niederée, & Stewart, 2005). This research topic is closely related to the growing popularity of semantic user modeling and distributed systems (Dolog & Nejdl, 2003). CSP is important because it attempts to match dynamic information with diverse user interests in the form of recommendations, which alleviates the problem of information overload. However, CSP faces challenges, including the conceptual and technical methodology for collecting usage data, analyzing it, and unifying user models from different sources (Niederée, Stewart, Mehta, & Hemmje, 2004).
Within the framework of CSP, this work investigates the research question “In terms of semantic relevance, what kind of automated process finds the best matches between a user’s topical interests and a web page’s content?” Focusing on modeling individual web users’ long-term interests on the client side, a personalized recommender agent is constructed as a desktop application. The agent aims to provide page recommendations from different website sources. This is achieved by constructing a Wikipedia-derived ontology and using it to formulate content and user models for recommendations. Information extraction and information retrieval techniques (TF-IDF, wrappers, etc.) are applied to extract categories and keywords from Wikipedia in a targeted domain, namely computer science.
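To illustrate the kind of matching described above, the following is a minimal sketch of TF-IDF weighting with cosine similarity, used to map a page to topical categories. It is not the agent's actual implementation: the category names, the sample texts, the smoothed IDF formula, and the function names (tokenize, build_category_vectors, rank_categories) are all assumptions made for this example, and a real system would draw category text from Wikipedia rather than from hand-written strings.

```python
import math
import re
from collections import Counter

# Hypothetical toy "ontology": category name -> descriptive text.
# In the proposal these would be derived from Wikipedia; here they are invented.
CATEGORIES = {
    "Machine learning": "learning algorithms train models from data using neural networks and classification",
    "Databases": "relational databases store tables and answer sql queries with indexes and transactions",
    "Computer networks": "networks route packets between hosts using protocols such as tcp and ip",
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def build_category_vectors(categories):
    """Return (vectors, idf): a TF-IDF vector per category and the IDF table."""
    tokens = {name: tokenize(text) for name, text in categories.items()}
    n_docs = len(tokens)
    # Document frequency: in how many categories each term occurs.
    df = Counter(term for toks in tokens.values() for term in set(toks))
    # Smoothed IDF (an assumed variant), so terms present everywhere keep a small weight.
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1.0 for term, count in df.items()}
    vectors = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        total = max(len(toks), 1)
        vectors[name] = {term: (count / total) * idf[term] for term, count in tf.items()}
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_categories(page_text, vectors, idf):
    """Score a page against every category, best match first."""
    # Terms unknown to the ontology are ignored.
    tf = Counter(t for t in tokenize(page_text) if t in idf)
    total = sum(tf.values())
    if total == 0:
        return []
    page_vec = {term: (count / total) * idf[term] for term, count in tf.items()}
    scores = [(name, cosine(page_vec, vec)) for name, vec in vectors.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


VECTORS, IDF = build_category_vectors(CATEGORIES)
PAGE = "A tutorial on tuning SQL queries and indexes in relational databases"
RANKING = rank_categories(PAGE, VECTORS, IDF)

if __name__ == "__main__":
    for name, score in RANKING:
        print(f"{name}: {score:.3f}")
```

In this sketch the ranked category list plays the role of a content model for the page. A user model could be represented in the same vector space (for example, by accumulating vectors of pages the user found relevant), so that the same cosine function compares user interests with page content.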
Our main hypothesis is that if user models align with content models, semantic relevance can be addressed for CSP. We will evaluate various model parameters (schema, interest indication, relevance feedback, etc.) and their performance with computer science professionals as participants. The semantic relevance of a recommended page will be determined from explicit judgments of topicality and novelty and from implicit clicking behavior. We will also compare modeled recommendations with random selections. Preliminary tests of the system suggest that the agent, using the Wikipedia-derived ontology, can map an unknown web page to its related topical categories fairly well, and that this mapping can serve as a content model. We will use the content models as user models for formal evaluations.
Regarding the significance of this work, using Wikipedia to derive ontologies and automate indexing is a worthwhile attempt. In addition, the agent system is deliberately constructed as a research platform for heuristic information extraction, so researchers will be able to implement further heuristics on top of it. The agent also facilitates knowledge evolution by bringing people’s attention to progress in the field. In conclusion, we will provide design principles and guidelines for CSP.

