In the seminar phase I considered the theoretical foundations of Visual Analytics and the aspects of it that will be applied in our project, in particular: geospatial and temporal visualization, plagiarism visualization, social network visualization, visualization of scientific collaboration, and perceptual and cognitive aspects. Additionally, I offered some ideas on what could be implemented during the project group.
Today I want to introduce my ideas for the recommenders in my prototype. Basically, I want to create a simple web application with JSP that provides different recommenders and corresponding visualizations based on the given MySQL dump. The following is a list of the recommenders I am thinking of, with a few thoughts on the implementation and sketches for possible visualizations.
1.) recommend papers based on citations as boolean preferences between papers (collaborative filtering)
implementation with Mahout: Create a data model based on boolean preferences (an association either exists or does not), then run the recommender with the different similarity metrics in Mahout that can work with boolean preferences, and evaluate and compare them.
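To make the idea of a similarity metric on boolean preferences concrete, here is a minimal sketch in plain Java, without Mahout. It computes the Tanimoto (Jaccard) coefficient between two papers, each described by the set of paper IDs it cites. The class and method names are hypothetical and chosen only for illustration. Mahout ships its own similarity classes for boolean data, which the prototype would use instead.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Mahout: Tanimoto (Jaccard) similarity
// between two papers, each represented by the set of paper IDs it cites.
class BooleanPreferenceSketch {

    // Returns |a AND b| / |a OR b|, or 0.0 if both sets are empty.
    static double tanimoto(Set<Long> a, Set<Long> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<Long> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }
}
```

Two papers that cite exactly the same papers get a similarity of 1.0, and two papers with no citation in common get 0.0.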
2.) recommend papers based on cocitation
implementation: If I understand the contents of the co_citation view correctly (count of the cocitations between two papers), this would simply be a maximum search with the ID of the input paper as one of the IDs in the view.
3.) recommend papers based on bibliographic coupling
implementation: Again, if I understand the contents of the bib_coupling view correctly (count of the bibliographic couplings between two papers), this would simply be a maximum search with the ID of the input paper as one of the IDs in the view.
(A denotes the recommendation, # the number of bibliographic couplings between the input paper and A)
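The maximum search described for the co_citation and bib_coupling views could look roughly like the following sketch. It assumes that each row of the view consists of two paper IDs and a count. The actual column layout of the views is not confirmed here, and the `Row` class and `topK` method are hypothetical names. In the real prototype the same thing could probably be done directly in SQL.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of the "maximum search" over a pair-count view.
class CouplingSearchSketch {

    // One row of the view: two paper IDs and their shared count (assumed layout).
    static final class Row {
        final long paperA;
        final long paperB;
        final int count;

        Row(long paperA, long paperB, int count) {
            this.paperA = paperA;
            this.paperB = paperB;
            this.count = count;
        }
    }

    // Returns the IDs of the k papers with the highest count together with the input paper.
    static List<Long> topK(List<Row> rows, long inputId, int k) {
        List<Row> matches = new ArrayList<>();
        for (Row r : rows) {
            boolean involvesInput = r.paperA == inputId || r.paperB == inputId;
            if (involvesInput && r.paperA != r.paperB) {
                matches.add(r);
            }
        }
        // Highest count first; the sort is stable, so ties keep their input order.
        matches.sort(Comparator.comparingInt((Row r) -> r.count).reversed());
        List<Long> result = new ArrayList<>();
        for (Row r : matches) {
            if (result.size() == k) {
                break;
            }
            // The recommendation is the paper on the other side of the pair.
            result.add(r.paperA == inputId ? r.paperB : r.paperA);
        }
        return result;
    }
}
```

The input paper may appear in either ID column, so both are checked and the other ID of the pair is returned.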
4.) recommend papers based on common keywords
implementation with Mahout: Create an item-based recommender with an ItemSimilarity class that computes the similarity between two papers based on their shared keywords.
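As a rough illustration of the computation such an ItemSimilarity class would need, the sketch below scores two papers by the cosine similarity of their keyword sets (treated as binary vectors). This is only one possible choice of measure, and the class and method names are hypothetical. The Mahout interface itself is left out, so this is plain Java.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: keyword-based similarity between two papers.
class KeywordSimilaritySketch {

    // Cosine similarity of two binary keyword vectors:
    // shared keywords / sqrt(|a| * |b|). Returns 0.0 if either set is empty.
    static double cosine(Set<String> a, Set<String> b) {
        if (a.isEmpty() || b.isEmpty()) {
            return 0.0;
        }
        Set<String> shared = new HashSet<>(a);
        shared.retainAll(b);
        return shared.size() / Math.sqrt((double) a.size() * b.size());
    }
}
```

Keywords would probably need some normalization first (lower-casing, trimming), which is not shown here.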
5.) recommend people based on co-authorship (collaborative filtering)
implementation with Mahout: Model co-authorship as preferences between authors, so that people who have often written together have a high preference for each other, and use a user-based recommender to find similar people.
6.) recommend people based on event participation
implementation with Mahout: Again model co-authorship as preferences between authors, and use an item-based recommender with an ItemSimilarity class that computes the similarity between two authors based on their common event participations. The recommendations should then be something like authors who often participated in the same events as the input author and/or his co-authors but never wrote a paper together with the input author.
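The intended result of this last recommender can be sketched without Mahout as follows. The sketch is a simplification: it only looks at events shared with the input author (not with his co-authors), counts shared events as the score, and filters out existing co-authors. The input maps and all names are hypothetical and would have to be filled from the database.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: recommend authors by shared event participation.
class EventRecommenderSketch {

    // eventsByAuthor: author ID -> IDs of the events that author took part in.
    // coAuthors: IDs of authors who already wrote a paper with the input author.
    static List<Long> recommend(long inputAuthor,
                                Map<Long, Set<Long>> eventsByAuthor,
                                Set<Long> coAuthors,
                                int k) {
        Set<Long> own = eventsByAuthor.getOrDefault(inputAuthor, new HashSet<>());
        List<long[]> scored = new ArrayList<>(); // each entry: {authorId, sharedEvents}
        for (Map.Entry<Long, Set<Long>> e : eventsByAuthor.entrySet()) {
            long candidate = e.getKey();
            if (candidate == inputAuthor || coAuthors.contains(candidate)) {
                continue; // skip the input author and existing co-authors
            }
            Set<Long> shared = new HashSet<>(e.getValue());
            shared.retainAll(own);
            if (!shared.isEmpty()) {
                scored.add(new long[] {candidate, shared.size()});
            }
        }
        // Most shared events first; ties are broken by the smaller author ID.
        scored.sort((x, y) -> x[1] != y[1]
                ? Long.compare(y[1], x[1])
                : Long.compare(x[0], y[0]));
        List<Long> result = new ArrayList<>();
        for (long[] s : scored) {
            if (result.size() == k) {
                break;
            }
            result.add(s[0]);
        }
        return result;
    }
}
```

Extending the score to events attended by the co-authors would mean merging their event sets into `own` before the loop.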
I have explained the rules for the seminar article many times before, but now I am also putting them in the blog, so anyone can refer to this post in case of questions.
- The SVN holds a template of the seminar article.
- You have to use LaTeX to produce the article.
- You have to continuously update your article in the SVN.
- You have to present a preliminary outline of your seminar article to the supervisors by 02.12.2011 at the latest.
- Present your ideas for your prototype by 09.12.2011 at the latest.
- The final version of the seminar article has to be ready by 31.01.2012.
- The minimum length of your article (including figures, tables and references) is 16 pages. The maximum length is 20 pages.
The seminar presentations will be in mid-January. Each of you will have to present your topic and prototype for around 25 minutes. After each presentation we will have a round of questions.
This topic is divided into three parts:
1. Trend detection in numbers:
More: Moving average and its classification, predictive analysis and forecasts, visualization.
Example: Stock markets
Useful tool(s): MS Excel
2. Trend detection in text:
More: Term document matrix, comparisons, how can we achieve it, mathematical models, can we use Java/C#, visualizations, ThemeRiver.
3. Custom Search Applications:
More: Apache Solr, web services, semantic search, possible linked data extensions.
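For the first part (trend detection in numbers), the simple moving average mentioned above can be sketched in a few lines of Java. This is only the plain, unweighted variant, and the class and method names are hypothetical. Weighted or exponential variants would need a different update step.

```java
// Hypothetical sketch: simple (unweighted) moving average over a sliding window.
class MovingAverageSketch {

    // Returns one average per full window, i.e. values.length - window + 1 entries.
    static double[] simpleMovingAverage(double[] values, int window) {
        if (window <= 0 || window > values.length) {
            throw new IllegalArgumentException("window must be between 1 and values.length");
        }
        double[] out = new double[values.length - window + 1];
        double sum = 0.0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
            if (i >= window) {
                sum -= values[i - window]; // drop the value that left the window
            }
            if (i >= window - 1) {
                out[i - window + 1] = sum / window;
            }
        }
        return out;
    }
}
```

Applied to a series of daily stock prices, a rising moving average would hint at an upward trend, while the window size controls how much short-term noise is smoothed away.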
Currently I am reading whatever I come across about trend detection. I am also learning various techniques with MS Excel. I have added a few papers that I found useful to Dropbox (links provided below).
Access e-books at Dropbox and papers at Mendeley in folder “Trend Detection” under PG PUSHPIN group.