About

U-Sem is a user modeling infrastructure for the Social Web, developed by the WIS group at TU Delft in collaboration with the Knowledge Representation and Reasoning (KRR) Group at Leeds University. Funded and inspired by the ImREAL (FP7) research project, it provides the means to create user modeling services that support specific application needs. U-Sem provides services such as detecting which countries a particular user has visited or what their interests are. U-Sem also provides a development environment in which services are created by combining functional components into workflows, using a workflow engine with a graphical interface for manipulating workflows (RDF Gears).

Building on experience with adaptive Web-based systems, the research has focused on taking the next step in user modeling for adaptation by tapping into the Social Web. Leveraging the knowledge that is 'hidden' in the Social Web allows adaptive systems and applications to enrich and extend their user modeling capability beyond what they 'know' about their users from their own system or application (e.g. logs).

In the ImREAL project this research was driven by the goal to augment adaptive training and learning applications (e.g. simulators) with knowledge that can be derived from the Social Web about the trainees and their context and background. Exploiting this knowledge from the Social Web enables the adaptive training situation to become more aware of the real world in which the trainees perform.

Scientifically, tapping into the Social Web to enrich adaptation is a complex challenge. It requires a deep understanding of the information that is 'hidden' on the Social Web and of the methods and techniques for extracting this information from Social Web data. In turn, this information can provide relevant user modeling knowledge to adaptive systems. Adaptive systems and applications typically ask for a complex set of relevant user-related knowledge. Therefore, the U-Sem user modeling infrastructure is developed as a platform for this research, providing user modeling services that help 'unlock' this knowledge from the Social Web.

The research approach we followed is divided into the following main stages:

  • Semantic Enrichment, Linkage and Alignment: processing and enrichment of the semantics of the available usage and user data, by applying enrichment strategies for different types of data, such as data from Twitter, Flickr and Instagram.
  • Analysis and User Modeling: processing of data for building learner profiles from Social Web and linked data, by identifying the most suitable sources from which to derive learner profile information and content enrichment, and by combining data from Social Web services (such as Twitter and Flickr) to generate better learner profiles. The investigations included analyzing the potential of different (social) Web sources, including Twitter, Flickr, YouTube and DBpedia, for user modeling and augmentation.
  • Adaptation and Personalization: augmenting the modeling of learners with real-world context taken from the Web and the Social Web.

The U-Sem infrastructure is under continuous development and is augmented with the experience from ongoing studies and evaluations. One line of U-Sem research concerns Twitter, with TweetUM, the Twitter-based user modeling framework, being one of the 'spin-off' products of U-Sem. Further research lines include the domain of Smart Cities and the collaboration with the Amsterdam Institute for Advanced Metropolitan Solutions, with tools for mining semantics in Social Web user data and user modeling functionality for inferring the cultural background of people based on Social Web streams. The aim is to support the digital cities of the future through innovation in smart data services for the public. Leveraging Social Web-based knowledge about users proves to be a good testbed for developing smart data services such as SocialGlass. Another example, again in the domain of e-Learning, is Learning Analytics to support Massive Open Online Courses.

The results and outcomes of U-Sem research are also applied in other domains and projects. For example, some of the analytical approaches have inspired the work in the Twitter Incident Management platform collaboration, where the mining of social media is targeted at supporting public safety in times of crisis.

Architecture

U-Sem allows developers to create and design services for enriching and analyzing user-related data and makes these services available to client applications. The architecture of the U-Sem user modeling infrastructure follows the state-of-the-art model for semantic-based user model augmentation and is depicted below.

The bottom left layer shows the services for preprocessing user data, such as learner and context data, to align it with the demands of the target applications (Semantic Enrichment, Linkage and Alignment). The bottom right layer shows the services for the actual user modeling and analysis (Analysis and User Modeling). In the layers above, the services are composed and orchestrated (Orchestration Logic), and the resulting models and analyses are made available for specific purposes, such as adapting an application (U-Sem Application Logic). Key components of U-Sem are:

  • Semantic Enrichment, Linkage and Alignment: plug-ins that enrich the semantic meaning of user-related data. Given observations about the user (e.g. usage data such as click-through data or other sorts of log data), user profile information or domain knowledge (e.g. descriptions of resources a user interacted with), these components clarify the semantics of that data so that it can be interpreted by the user modeling components. Alignment components aim to resolve problems caused by heterogeneous schemata or ambiguity of terms.
  • Analysis and User Modeling: plug-ins that analyze the (enriched) user-related data to infer user profiles. U-Sem allows for a variety of user modeling modules, ranging from plug-ins that infer profile attributes such as interests, knowledge, skills or demographic characteristics for individual users to plug-ins that follow a stereotype-based user modeling approach and therefore analyze data related to a community of users.
  • Orchestration Logic: engines that allow orchestrating a set of plug-ins from the layer below in order to provide workflows/orchestrations that provide certain user modeling functionality. The orchestration logic allows for executing enrichment and user modeling pipelines which are composed of plug-ins of the semantic enrichment and user modeling layer. Such orchestrations of plug-ins are made available as services to U-Sem clients via the U-Sem endpoints.
  • Endpoints: interfaces that allow client applications to call U-Sem services (the orchestrations created by the Orchestration Logic). Dialog-based support allows for negotiation between U-Sem and U-Sem clients where U-Sem may augment profile information sent by a client until it meets the requirements of the client.
  • U-Sem Application Logic: controls the U-Sem application flow (based on incoming requests), i.e. it connects the endpoints with the orchestration logic and plug-ins and provides functionality related to access control or plug-in management.
  • U-Sem Clients: client applications that connect to U-Sem via the endpoints. U-Sem clients are software components as well, i.e. communication with the actual end-users for whom U-Sem may infer interests, knowledge, skills, etc. is encapsulated in the client applications (e.g. ImREAL simulators).
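
The layering described above can be illustrated with a minimal Python sketch. It is not U-Sem's actual API: the names Plugin, enrich_hashtags, build_interest_profile, Orchestrator and the service name "interest-profile" are hypothetical, and treating hashtags as concepts is an assumption made only to keep the example small. It shows an enrichment plug-in and a user modeling plug-in composed into a pipeline that a client calls through a single entry point.

```python
from collections import Counter
from typing import Any, Callable, Dict, List

# Hypothetical type: a plug-in is any callable that transforms its input.
Plugin = Callable[[Any], Any]


def enrich_hashtags(posts: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Enrichment plug-in (illustrative): attach hashtag concepts to each post."""
    enriched = []
    for post in posts:
        words = str(post.get("text", "")).split()
        concepts = [w[1:].strip(".,!?;:").lower() for w in words if w.startswith("#")]
        enriched.append({**post, "concepts": [c for c in concepts if c]})
    return enriched


def build_interest_profile(posts: List[Dict[str, Any]]) -> Dict[str, float]:
    """User modeling plug-in (illustrative): relative frequency of each concept."""
    counts = Counter(c for post in posts for c in post.get("concepts", []))
    total = sum(counts.values())
    return {concept: n / total for concept, n in counts.items()} if total else {}


class Orchestrator:
    """Orchestration logic (illustrative): named pipelines of plug-ins."""

    def __init__(self) -> None:
        self._services: Dict[str, List[Plugin]] = {}

    def register(self, name: str, plugins: List[Plugin]) -> None:
        self._services[name] = list(plugins)

    def call(self, name: str, data: Any) -> Any:
        """Endpoint-style entry point: run the named pipeline on client data."""
        for plugin in self._services[name]:
            data = plugin(data)
        return data


orchestrator = Orchestrator()
orchestrator.register("interest-profile", [enrich_hashtags, build_interest_profile])
profile = orchestrator.call(
    "interest-profile",
    [{"text": "Visiting #Delft for a #SemanticWeb workshop"}, {"text": "Back in #delft!"}],
)
print(profile)
```

Keeping each plug-in a plain function means an orchestration is just an ordered list, which mirrors the idea that services are built by recombining components rather than by writing new code per client.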

Scientific and Commercial Applications

  • Culture-aware User Modeling. An analysis based on Chinese (Sina Weibo) and English (Twitter) microblogging data that compares the microblogging behavior of Chinese users and Western (American) users and relates the findings to theories about cultural stereotypes developed in the social sciences. More details here.
  • CrowdSense and the Twitter Incident Management (TIM) framework, a technology following the approach of U-Sem that allows Twitter messages to be monitored in real time for the purpose of increased public safety and security (TIM is a spin-off of the TUD WIS group).

U-Sem Services and Methods

  • Twitter-based User Modeling: a suite of U-Sem plug-ins that allow for understanding the semantics of short text snippets (tweets) published by a learner and deducing interest profiles from tweeting activities. [ more details ]
  • Knowledge Profiling: the knowledge profiling components deduce a learner's knowledge about different concepts by analyzing the learner's social activities. [ more details ]
  • Location Detection: given the activities a user performs on the Social Web, the location detection component creates a location profile for a learner. [ more details ]
  • Faceted Search: this U-Sem service provides functionality to perform faceted search on the data generated by the learners. In particular, it also supports filtering the tweets that have been published by the learner. [ more details ]
  • ViewS: semantic augmentation of user generated content and visualisation of viewpoints [ more details ]
  • Multilingual Ontology Matching: content and user data may come in different languages and schemata. This component aligns such multilingual ontological data. [ more details ]
  • Domain-aware Ontology Matching: for matching heterogeneous data and schemata that originate from different applications and domains, U-Sem provides domain-aware mapping services. [ more details ]
  • Deriving Group Profiles from YouTube: a suite of services that mine the user-created content on the video social sharing site YouTube. [ more details ]
  • Language Detection: learners may speak different languages. This component infers the language skills of a given learner by analyzing her data traces. [ more details ]
  • Interactive User Modeling Dialogue: facilitates an interactive dialogue with the learner using semantically augmented content. [ more details ]
  • RDF Gears: the core orchestration engine of U-Sem is called RDF Gears. It allows designers and developers to orchestrate the functionality that is provided by the components above and to create customized augmented user modeling services. [ more details ]
  • Twinder: the search engine for Social Web streams of U-Sem is called Twinder. It allows users to retrieve information on user-defined topics from Social Web platforms, especially short messages such as tweets and status messages. [ more details ]
  • Culture-aware User Modeling and Analysis: the Culture-aware User Modeling and Analysis framework allows for analyzing and comparing microblogging behavior for users from different cultural groups. [ more details ]
  • Learning style analysis: a framework that investigates the ability to extract user learning styles from social Web activities [ more details ]
  • Automatic User Account Matching: a framework that investigates the task of automatically identifying users across a number of social Web portals [ more details ]
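
As a rough illustration of what a service such as Location Detection might return, the following Python sketch derives a location profile from a user's activities. It is an assumption-laden example, not the U-Sem implementation: the function names location_profile and visited_countries, the "country" field, and the idea that each activity has already been resolved to a country code upstream are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional

# Hypothetical input: one dict per Social Web activity, where "country" holds
# an already resolved country code, or is missing/None if unknown.
Activity = Dict[str, Optional[str]]


def location_profile(activities: Iterable[Activity]) -> Dict[str, float]:
    """Return the share of geo-resolved activities per country."""
    counts = Counter(a["country"] for a in activities if a.get("country"))
    total = sum(counts.values())
    return {country: n / total for country, n in counts.items()} if total else {}


def visited_countries(activities: Iterable[Activity], min_share: float = 0.0) -> List[str]:
    """List countries whose share of activities is at least min_share."""
    profile = location_profile(activities)
    return sorted(c for c, share in profile.items() if share >= min_share)


activities = [
    {"country": "NL"},
    {"country": "NL"},
    {"country": "IT"},
    {"country": None},  # activity without usable location information
    {},
]
print(location_profile(activities))
print(visited_countries(activities, min_share=0.5))
```

The min_share threshold is one simple way to separate places a user merely passed through from places that dominate their activity; a real service would likely use richer evidence.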

Selected Publications

For a full overview of publications, please have a look at the group's publication page here.

  1. Adam Moore, Gudrun Wesiak, Christina M. Steiner, Claudia Hauff, Declan Dagger, Gary Donohoe, Owen Conlan. Utilizing social networks for user model priming: user attitudes. In Proceedings of UMAP 2013 Late-Breaking Results and Project Papers, Rome, Italy, June 10-14, 2013.
  2. Claudia Hauff. A Study on the Accuracy of Flickr's Geotag Data. In Proceedings of the 36th Annual ACM SIGIR Conference (SIGIR), Dublin, Ireland, July 2013.
  3. Claudia Hauff, Gerald Friedland. Brave New Task: User Account Matching. In Working Notes Proceedings of the MediaEval 2012 Workshop, Pisa, Italy, October 4-5, 2012.
  4. Fabian Abel, Qi Gao, Geert-Jan Houben, Ke Tao. Twitter-based User Modeling for News Recommendations. In Proceedings of 23rd International Joint Conference on Artificial Intelligence (IJCAI2013), Beijing, China, August 2013.
  5. Gudrun Wesiak, Adam Moore, Christina M. Steiner, Claudia Hauff, Conor Gaffney, Declan Dagger, Dietrich Albert, Fionn Kelly, Gary Donohoe, Gordon Power, Owen Conlan. Affective Metacognitive Scaffolding and User Model Augmentation for Experiential Training Simulators: A Follow-up Study. In Proceedings of Eighth European Conference on Technology Enhanced Learning (EC-TEL 2013), Paphos, Cyprus, September 2013.
  6. Jan Hidders, Jacek Sroka, Paolo Missier. Report from the first workshop on scalable workflow enactment engines and technology (SWEET'12). Report of the First International Workshop on Scalable Workflow Enactment Engines and Technologies, SIGMOD Record 41(4): 60-64, 2012
  7. Ke Tao, Claudia Hauff, Geert-Jan Houben. Building a Microblog Corpus for Search Result Diversification. In Proceedings of 9th Asia Information Retrieval Societies Conference (AIRS 2013), Singapore, 2013.
  8. Ke Tao, Fabian Abel, Claudia Hauff, Geert-Jan Houben, Ujwal Gadiraju. Groundhog Day: Near-Duplicate Detection on Twitter. In Proceedings of 22nd International World Wide Web Conference, Rio de Janeiro, Brazil, 2013.
  9. Ke Tao, Fabian Abel, Claudia Hauff, Geert-Jan Houben, Ujwal Gadiraju. Twinder - Enhancing Twitter Search. PROMISE Winter School 2013: Bridging between Information Retrieval and Databases, Volume 8173 of Lecture Notes in Computer Science. Springer-Verlag.
  10. Marek Grabowski, Jan Hidders, Jacek Sroka. Representing MapReduce Optimisations in the Nested Relational Calculus. In Proceedings of Big Data - 29th British National Conference on Databases, BNCOD 2013, Oxford, UK, July 8-10, 2013.
  11. Pasquale De Meo, Emilio Ferrara, Fabian Abel, Lora Aroyo, Geert-Jan Houben. Analyzing User Behavior across Social Sharing Environments. ACM Transactions on Intelligent Systems and Technology, 2014 Vol. 5, No. 1
  12. Ke Tao, Claudia Hauff, Fabian Abel, Geert-Jan Houben. Information Retrieval for Twitter Data. Twitter & Society, Weller, Katrin; Bruns, Axel; Burgess, Jean; Mahrt, Merja and Cornelius Puschmann (eds.), New York, NY: Peter Lang.
  13. Elaheh Momeni Roochi, Ke Tao, Bernhard Haslhofer, Geert-Jan Houben. Identification of Useful User Comments in Social Media: A Case Study on Flickr Commons. In Proceedings of the ACM IEEE Joint Conference on Digital Libraries (JCDL 2013), Indianapolis, USA, July 2013.
  14. C. Hauff and G. J. Houben. Geo-Location Estimation of Flickr Images: Social Web Based Enrichment. In Proceedings of Advances in Information Retrieval - 34th European Conference on IR Research (ECIR 2012), Barcelona – Spain, 1-5 April 2012
  15. F. Abel, C. Hauff, G.-J. Houben, R. Stronkman and K. Tao. Twitcident: Fighting Fire with Information from Social Web Streams. Demo at 21st World Wide Web Conference (WWW 2012), Lyon – France, 16-20 April 2012
  16. Q. Gao, F. Abel, G.J. Houben and Y. Yu. Information propagation cultures on Sina Weibo and Twitter. In Proceedings of Web Science 2012, Evanston – USA, 22-24 June 2012
  17. F. Abel, C. Hauff, G.-J. Houben, R. Stronkman and K. Tao. Semantics + Filtering + Search = Twitcident: Exploring Information in Social Web Streams. In Proceedings of 23rd ACM Conference on Hypertext and Social Media, Milwaukee – USA, 25-28 June 2012
  18. Q. Gao, F. Abel, G.-J. Houben and Y. Yu. A Comparative Study of Users' Microblogging Behavior on Sina Weibo and Twitter. In Proceedings of User Modeling, Adaptation, and Personalization - 20th International Conference (UMAP 2012), Montreal – Canada, 16-20 July 2012
  19. K. Tao, F. Abel, C. Hauff and G.-J. Houben. Twinder: a search engine for Twitter streams. In Proceedings of 12th International Conference, ICWE 2012, Berlin – Germany, 23-27 July 2012
  20. C. Hauff and Geert-Jan Houben. Placing images on the world map: a microblog-based enrichment approach. In Proceedings of 35th International ACM SIGIR Conference on Research and development in information retrieval, Portland, Oregon – USA, August 25th 2012
  21. E. Ilina, F. Abel and G.-J. Houben. Mining Twitter for Cultural Patterns. In Proceedings of ABIS 2012 Workshop on Personalization and Recommendation on the Web and Beyond, 2012, ABIS 2012 - Konstanz (Germany), 9-12 September
  22. Fabian Abel, Eelco Herder, Geert-Jan Houben, Nicola Henze, Daniel Krause. Cross-system User Modeling and Personalization on the Social Web. In P. Brusilovsky, D. Chin (eds.): User Modeling and User-Adapted Interaction (UMUAI), Special Issue on Personalization in Social Web Systems, 2011 [bib]
  23. Fabian Abel, Qi Gao, Geert-Jan Houben, Ke Tao. Semantic Enrichment of Twitter Posts for User Profile Construction on the Social Web. In Proceedings of 8th Extended Semantic Web Conference (ESWC2011), Heraklion, Crete, Greece, May 2011. [bib, pdf]
  24. Eric Feliksik. A data integration framework for the Semantic Web. Master thesis, TU Delft, 2011. [pdf]
  25. Dennis Spohr, Laura Hollink, Philipp Cimiano. Multilingual and Cross-Lingual Ontology Matching and its Application to Financial Accounting Standards. In Proceedings of 10th International Semantic Web Conference (ISWC), Bonn, Germany, October 2011.
  26. Fabian Abel, Ilknur Celik, Geert-Jan Houben, Patrick Siehndel. Leveraging the Semantics of Tweets for Adaptive Faceted Search on Twitter. In Proceedings of 10th International Semantic Web Conference (ISWC), Bonn, Germany, October 2011 [bib, pdf]
  27. Kristian Slabbekoorn, Laura Hollink, Geert-Jan Houben. Domain-aware Matching of Events to DBpedia. In DeRiVE workshop on Detection, Representation, and Exploitation of Events in the Semantic Web at ISWC, Bonn, Germany, 2011.
  28. Qi Gao, Fabian Abel, Geert-Jan Houben, Ke Tao. Interweaving Trend and User Modeling for Personalized News Recommendation. In Proceedings of IEEE/WIC/ACM International Conference on Web Intelligence (WI), Lyon, France, August 2011 [bib, pdf]
  29. Claudia Hauff and Geert-Jan Houben. Deriving Knowledge Profiles from Twitter. In Proceedings of 6th European conference on Technology enhanced learning: towards ubiquitous learning (EC-TEL), Palermo, Italy, September 2011 [pdf]
  30. Fabian Abel, Qi Gao, Geert-Jan Houben, Ke Tao. Analyzing User Modeling on Twitter for Personalized News Recommendations. In Proceedings of International Conference on User Modeling, Adaptation and Personalization (UMAP), Girona, Spain, July 2011 [bib, pdf] (won best paper award at UMAP 2011)
  31. Ahmad Ammari, Vania Dimitrova, Dimoklis Despotakis. Semantically Enriched Machine Learning Approach to Filter YouTube Comments for Socially Augmented User Models. In Proceedings of the International Workshop on Augmenting User Models with Real World Experiences to Enhance Personalization and Adaptation, co-located with the International Conference on User Modeling, Adaptation and Personalization (UMAP2011), Girona, Spain, pp. 6
  32. Ilknur Celik, Fabian Abel, Patrick Siehndel. Adaptive Faceted Search on Twitter. In Proceedings of International Workshop on Semantic Adaptive Social Web (SASWeb), in connection with UMAP, Girona, Spain, July 2011 [bib, pdf]
  33. Fabian Abel, Samur Araújo, Qi Gao, Geert-Jan Houben. Analyzing Cross-System User Modeling on the Social Web. In Proceedings of Eleventh International Conference on Web Engineering (ICWE), Paphos, Cyprus, June 2011 [bib, pdf]
  34. Ilknur Celik, Fabian Abel, Geert-Jan Houben. Learning Semantic Relationships between Entities in Twitter. In Proceedings of Eleventh International Conference on Web Engineering (ICWE), Paphos, Cyprus, June 2011 [bib, pdf]
  35. Fabian Abel, Qi Gao, Geert-Jan Houben, Ke Tao. Analyzing Temporal Dynamics in Twitter Profiles for Personalized Recommendations in the Social Web. In Proceedings of ACM International Conference on Web Science (WebSci), Koblenz, Germany, June 2011 [bib, pdf]
  36. Ilknur Celik, Fabian Abel, Patrick Siehndel. Towards a Framework for Adaptive Faceted Search on Twitter. In Proceedings of International Workshop on Dynamic and Adaptive Hypertext (DAH), in connection with ACM Hypertext, Eindhoven, The Netherlands, June 2011 [bib, pdf]
  37. Fabian Abel, Qi Gao, Geert-Jan Houben, Ke Tao. Semantic Enrichment of Twitter Posts for User Profile Construction. In Proceedings of 8th Extended Semantic Web Conference (ESWC), Heraklion, Crete, Greece, May 2011 [bib, pdf]
  38. Ke Tao, Fabian Abel, Qi Gao, Geert-Jan Houben. TUMS: Twitter-based User Modeling Service. In Proceedings of the International Workshop on User Profile Data on the Social Semantic Web (UWeb), ESWC, Heraklion, Crete, Greece, May 2011 [bib, pdf]
  39. Fabian Abel, Ilknur Celik, Claudia Hauff, Laura Hollink, Geert-Jan Houben. U-Sem: Semantic Enrichment, User Modeling and Mining Usage Data on the Social Web. In Proceedings of International Workshop on Usage Analysis and the Web of Data (USEWOD), co-located with WWW '11, Hyderabad, India, March 2011 [bib, pdf]
