Web 2.0 platforms such as blogs, online news, social networks, and Internet forums allow users to write comments to express their interests and opinions about the content of news articles, videos, blogs, forum posts, etc. Users’ comments contain additional information about the content of Web documents and provide an important means of user interaction. In this paper, we present a study on the task of recommending, for a given user, a short list of suitable stories for commenting.

An important development of the World Wide Web, known as Web 2.0, is characterized by the explosion of user-generated content such as blogs, wikis, social networking sites, video sharing, online news, and more traditional Internet forums. In such self-publication settings, a user can publish articles or submit posts to share with others on any topic. Other users can view and comment on the posts, and these comments can subsequently be viewed and commented on. Apparent benefits of self-publication media are that they facilitate more user engagement and enable social interactions between users. The success of self-publication media is evidenced by their increasing popularity and the fast-growing volume of information generated. At the same time, the increasing volume of self-published content presents new challenges, one of which is additional information overload for users. Because of the large number of posts and comments, finding valuable posts to read and comment on has become a difficult and time-consuming task. Recently, predicting which posts are suitable for a specific user to read and comment on, followed by personalized recommendations, has attracted significant research interest [2,3,5,28,29,37,38,45]. Recommendations of this kind can alleviate the problem of information overload in self-publication media.

In this paper, we consider the task of recommending stories for users of forum-based media. Specifically, the problem is to provide, for each user, a personalized ordered list of new posts or stories that the user is likely to comment on (in this paper, “post” and “story” are used interchangeably). The most popular and successful technique for generating such recommendations is collaborative filtering (CF) [3,20,36], in which recommendations are based on the behavior of users with similar tastes. There are several considerations when applying collaborative filtering to forum post recommendation. First, in forum-based media, comments appear frequently, and each new comment requires the collaborative filtering algorithm to be re-run on the updated data. In such a dynamic environment, it is desirable to develop efficient algorithms that can make small online updates as new comments arrive [29]. Second, in many forum-based media types such as online news services, it is critical to recommend stories to users as soon as they are posted, as the freshness of stories is important for user interest. This requires the collaborative filtering algorithm to handle new stories with a small number of comments. In the collaborative filtering literature, this is known as the “new item” problem, and it presents challenges to most collaborative filtering algorithms [6,14,33]. Third, in addition to comments, other types of information are available in a forum-based environment, such as the content of stories, user ratings on comments, social attributes of users, etc. It is desirable to combine such different types of information and different types of recommendation algorithms to make more accurate predictions about user preferences [20,37,39]. A successful recommender system for forum-based media must address these considerations.

We address these problems with two solutions, which are the two main contributions of this paper. As the first contribution, we propose a collaborative filtering method that exploits the commenting behavior of users and the co-commenting relationships between them to recommend posts. The proposed method is simple to implement and does not require full re-runs as new comments arrive, which makes it suitable for dynamic forum-based environments (the first problem mentioned above). We compare the proposed method with popular content-based filtering (CBF) methods that represent stories by vector space models such as TF–IDF scores [8] and topic distributions generated by a topic modeling technique [9]. We perform experiments to verify the ability of the proposed CF algorithm to recommend fresh stories with a small number of comments (the second problem highlighted above), both when used individually and when combined with CBF algorithms.

As the second contribution, we propose a novel hybrid recommendation framework that combines different types of CBF features with the proposed CF features to achieve improved prediction accuracy (the third problem highlighted above). The idea of hybrid recommendation is not new [1]. The simplest hybrid method performs CBF and CF separately and then combines the results, e.g. by taking the sum of CBF and CF ratings. Other hybrid methods use content-based information to augment the user-item matrix before applying CF algorithms [32], or integrate all content-based and collaborative data to learn prediction models [7,35]. Recently, Sun et al. [39] proposed using learning-to-rank for hybrid recommendation. They transform user and item data into a new, content-comparable representation before applying RankSVM to the transformed features. In this work, we also propose using learning-to-rank to combine CBF and CF. However, our method differs from the method of Sun et al. and other hybrid methods in that we apply learning-to-rank to CBF and CF scores, not to user-item or content-item matrices. By doing so, our method can integrate multiple scores generated by different CBF and CF algorithms. Furthermore, the combination weights are learned automatically by the learning-to-rank algorithm. We present experimental results showing that the proposed learning-to-rank hybrid method performs better than both the CBF methods and the CF method. When used with our CF method, the proposed hybrid method provides accurate recommendations while remaining efficient. Because the main aim of this work is to provide efficient and accurate recommendations of fresh stories in forum-based media, the two proposed methods are the two components of a comprehensive solution for that application.
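
To make the idea of combining scores concrete, the following is a minimal sketch, not the implementation used in this paper. It assumes each candidate story already has one score per algorithm (the story names, score values, and weight values are hypothetical) and ranks stories by a weighted sum. With equal weights it reduces to the simple sum-based hybrid mentioned above; in the proposed framework the weights would instead be produced by a learning-to-rank algorithm.

```python
from typing import Dict, List, Sequence


def rank_by_combined_score(
    scores: Dict[str, Sequence[float]],
    weights: Sequence[float],
) -> List[str]:
    """Rank stories by a weighted sum of their per-algorithm scores."""

    def combined(story_id: str) -> float:
        return sum(w * s for w, s in zip(weights, scores[story_id]))

    # Highest combined score first.
    return sorted(scores, key=combined, reverse=True)


# Hypothetical scores per story: (TF-IDF CBF score, topic CBF score, CF score).
candidate_scores = {
    "story_a": (0.9, 0.2, 0.1),
    "story_b": (0.3, 0.4, 0.8),
    "story_c": (0.1, 0.1, 0.3),
}

# Simplest hybrid: equal weights, i.e. a plain sum of the scores.
sum_ranking = rank_by_combined_score(candidate_scores, (1.0, 1.0, 1.0))

# Weighted hybrid: illustrative weights standing in for learned ones (assumed values).
learned_ranking = rank_by_combined_score(candidate_scores, (0.2, 0.3, 1.5))
```

Because the combination operates on scores rather than on the underlying matrices, adding another CBF or CF algorithm only adds one more entry to each score tuple and one more weight.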

The rest of this paper is organized as follows. Section 2 describes the research background and related work. Section 3 presents recommendation methods, including traditional content-based filtering methods, our proposed collaborative filtering method, and a novel hybrid method using learning-to-rank. Section 4 describes our two datasets, experimental setup, and experimental results. Finally, Section 5 concludes the paper and discusses future work.

2. Related work and background

2.1. Basis of recommender systems

In a broader context, our solution is a specific case of recommender systems, which have been studied for years with many successful deployments (Melville and Sindhwani [33]). A recommender system provides recommendations in the form of a short list of movies, news, books, products, etc. that the user is likely to be interested in [4,11,13]. Recommender systems not only help users choose products or services but also help manufacturers personalize their advertising efforts by automating the generation of recommendations based on data analysis. The methods of designing recommender systems and their applications to real-world problems have been an active area of research [26,33,44,45]. Recommender systems depend on the domain and characteristics of the data available, and recommendation methods differ in the way they exploit these data sources to generate recommendations.

The following is a short summary of the four main approaches to building recommender systems, given for comparison:


  • Collaborative systems collect user feedback in the form of ratings for items in a given domain and exploit similarities in rating behavior among several users to generate recommendations. CF methods can be further subdivided into neighborhood-based (or memory-based) and model-based approaches. While neighborhood-based techniques select a subset of users based on their similarity to the active user and combine their ratings to produce predictions for this user, model-based techniques generate recommendations by estimating parameters of statistical models for user ratings.
  • Content-based systems generate recommendations by exploiting two information sources: the features associated with products and the ratings that a user has given them. Content-based recommenders consider the recommendation task as a user-specific classification problem and learn a classifier for the user’s likes and dislikes based on product features and the user profile.
  • Demographic recommender systems provide recommendations based on a demographic profile of the user. Recommended products can be produced for different demographic niches, by combining the ratings of users in those niches.
  • Knowledge-based recommender systems suggest products based on inferences about a user’s needs and preferences. This knowledge will sometimes contain explicit functional knowledge about how certain product features meet user needs.
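
As an illustration of the neighborhood-based approach described in the first item, the sketch below predicts a rating as a similarity-weighted average over the most similar users. It is a generic textbook-style example, not the CF method proposed in this paper; the cosine similarity measure, the neighborhood size `k`, and the toy ratings are all assumptions made for illustration.

```python
import math
from typing import Dict

Ratings = Dict[str, Dict[str, float]]  # user -> {item: rating}


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse rating vectors."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_rating(ratings: Ratings, user: str, item: str, k: int = 2) -> float:
    """Similarity-weighted average of the k most similar users' ratings for item."""
    neighbours = [
        (cosine_similarity(ratings[user], ratings[other]), ratings[other][item])
        for other in ratings
        if other != user and item in ratings[other]
    ]
    top = sorted(neighbours, reverse=True)[:k]
    total_sim = sum(sim for sim, _ in top)
    if total_sim == 0:
        return 0.0  # no usable neighbours: fall back to a neutral score
    return sum(sim * rating for sim, rating in top) / total_sim


# Toy data (hypothetical users and stories).
toy_ratings: Ratings = {
    "alice": {"s1": 5, "s2": 3},
    "bob": {"s1": 4, "s2": 3, "s3": 4},
    "carol": {"s1": 1, "s3": 2},
}
prediction = predict_rating(toy_ratings, "alice", "s3")
```

Here the prediction for alice lands closer to bob's rating than to carol's, because bob's rating history is more similar to hers.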

When ratings from users are available, in either explicit or implicit form, collaborative filtering is often the approach of choice [33,34]. However, collaborative filtering may suffer from data sparsity, i.e. each user provides ratings for only a small fraction of the items. To alleviate this problem, some recommender systems rely on trust scores provided by certain social networks to compute a set of trusted users and use their opinions to make recommendations [31]. Such a trust-based approach reduces the need for a large number of ratings to identify users with similar behaviors. However, it requires trust information. Because trust scores are not always available in online forums, we do not consider trust-based recommendation in our work. Instead, we consider a hybrid approach that combines several types of collaborative and content-based scores.

2.2. Recommender systems for online media

CBF and CF techniques have been extensively used to provide recommendations for various types of online media. Personalized news recommenders were among the first systems of this kind. Resnick et al. [36] present one of the first CF solutions for recommending netnews, which is known as memory-based CF. Das et al. [18] describe a large-scale MapReduce-based news recommender system for the Google News service, which predicts news articles the user is likely to view by combining three CF algorithms, namely MinHash clustering, Probabilistic Latent Semantic Indexing (PLSI), and covisitation counts. More recent works often use the hybrid approach. The system described in [29] combines content profiles of visited news articles with user click history to provide improved recommendations for Google News. Li et al. [27] use both content and collaborative information in a two-stage framework: news articles are first clustered and topics are extracted from them; the topics are then used together with users’ access patterns to make recommendations. Chu and Park [14] also add user demographic information to address the cold-start problem.

The studies most closely related to our work are those on recommending news and posts for commenting. There are two types of recommendations for commenting. The first type provides comments that the user likes [2,38], while the second type recommends news articles or posts that the user is likely to comment on [3], which is the focus of this work. Perhaps the simplest form of such systems just predicts whether a post will receive many, few, or no comments based on the content and other metadata of the post [40]. Other systems provide explicit recommendations. Li et al. [28] propose to combine different features of comments, such as authority, structural, and semantic features, with features of the posts to make recommendations that are general rather than personalized for a specific user.

A more traditional recommender system that provides personalized predictions is presented by Shmueli et al. [37]. Their system predicts news stories that a given user is likely to comment on. Recommendations are generated by combining content features of news articles, such as tags, with collaborative information in the form of co-commenters within a latent factor modeling framework. The work by Shmueli et al. [37] is the most closely related to ours. However, our work differs from theirs in two aspects. First, we use a CF algorithm that is simple, fast to update as new comments arrive, and yet effective, as shown by experiments. Second, our hybrid recommendation approach based on learning-to-rank can integrate multiple types of CBF and CF scores produced by different CBF and CF algorithms. As experiments show, this feature of the proposed hybrid method leads to improvements in recommendation accuracy.

2.3. Learning-to-rank

Learning-to-rank [25,30] is a type of supervised or semi-supervised machine learning problem in which the goal is to automatically construct a ranking model from training data. This ranking model can then be used to produce a permutation of the items in new, unseen lists that is similar, in some sense, to the rankings in the training data. Learning-to-rank algorithms have also been applied in areas other than information retrieval, e.g. machine translation [42] and recommender systems [23,39]. In the following, we compare learning-to-rank with other traditional tasks such as classification and regression in terms of the input, the output, and the learning goals [25].

  • Classification. The input of a classification problem is a feature vector x ∈ R^d, and the output is a label y ∈ Y, where Y is a set of predefined labels. In classification, the goal is to learn a classifier f(x) which can determine the class label y of a given feature vector x.
  • Regression. In regression, the input is a feature vector x ∈ R^d, the output is a real number y ∈ R, and the goal is to learn a function f(x) which can determine the real number y of a given feature vector x.
  • Ordinal classification or ordinal regression. This is close to ranking, but is also different. The input is a feature vector x ∈ R^d, and the output is a label y ∈ Y representing a grade, where Y is a set of grade labels. The goal of learning is to learn a model f(x) which can determine the grade label y of a given feature vector x. The model first calculates the score f(x), and then it decides the grade label y using some thresholds. Specifically, the model segments the real number axis into some intervals and assigns to each interval a grade. It then takes the grade of the interval which f(x) falls into as the grade of x.
  • Learning-to-rank. The input is a list of items, the output is the item list in a specific order, and the goal is to build a ranking model which can order items. In ranking, one cares more about accurate ordering of objects, while in ordinal classification, one cares more about accurate ordered-categorization of objects.
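
The thresholding step of ordinal classification can be sketched as follows. This is a minimal illustration under assumed values: the thresholds, the grade names, and the convention that each interval is closed on the left are all hypothetical choices, and the score f(x) is taken as given rather than learned.

```python
from bisect import bisect_right
from typing import Sequence


def grade_from_score(
    score: float,
    thresholds: Sequence[float],
    grades: Sequence[str],
) -> str:
    """Return the grade of the interval that the score f(x) falls into.

    thresholds must be sorted in ascending order, and there must be
    exactly one more grade than there are thresholds.
    """
    if len(grades) != len(thresholds) + 1:
        raise ValueError("need exactly len(thresholds) + 1 grades")
    # bisect_right counts how many thresholds are <= score,
    # which is the index of the interval containing the score.
    return grades[bisect_right(thresholds, score)]


# Hypothetical thresholds splitting the real axis into three intervals.
example_thresholds = [0.0, 1.0]
example_grades = ["irrelevant", "relevant", "highly relevant"]

low = grade_from_score(-0.5, example_thresholds, example_grades)
mid = grade_from_score(0.4, example_thresholds, example_grades)
high = grade_from_score(2.3, example_thresholds, example_grades)
```

A ranking model, by contrast, would only need the relative order of the scores −0.5, 0.4 and 2.3, not the intervals they fall into.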

Learning-to-rank approaches fall into one of three main categories: pointwise, pairwise, and listwise. In the pointwise approach, the ranking task is treated as a traditional regression problem. The order of the training data is transformed into scores, and a regression model is trained to predict a score for new objects. The objects are then ranked based on the predicted scores. In addition to regression, classification and ordinal classification can also be used for the pointwise approach.

In the pairwise approach, ranking is transformed into traditional binary classification. Each pair of objects (x1, x2) is classified as having the label +1 if x1 is ranked higher than x2, or −1 otherwise. Thus, the training set contains a collection of samples, each consisting of a pair of feature vectors (x1, x2) and a label y ∈ {+1, −1} (here we overload the notation by using xi to denote both the object and its feature representation). Once every pair of objects has been ordered in this way, it is straightforward to construct the final ranked list.
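
The construction of pairwise training samples can be sketched as below. This is an illustrative example rather than the code used in our experiments: the feature vectors and relevance grades are made up, and skipping tied pairs is an assumption, since the description above does not say how ties are handled. The helper `difference_vector` shows the x1 − x2 representation that RankSVM classifies.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Vector = Sequence[float]
PairSample = Tuple[Vector, Vector, int]  # (x1, x2, label)


def make_pairwise_samples(
    features: Sequence[Vector],
    relevance: Sequence[float],
) -> List[PairSample]:
    """Turn one ranked list into (x1, x2, y) samples with y in {+1, -1}."""
    samples: List[PairSample] = []
    for i, j in combinations(range(len(features)), 2):
        if relevance[i] == relevance[j]:
            continue  # assumption: tied pairs express no preference
        label = 1 if relevance[i] > relevance[j] else -1
        samples.append((features[i], features[j], label))
    return samples


def difference_vector(x1: Vector, x2: Vector) -> List[float]:
    """Feature difference x1 - x2, the input a pairwise classifier sees."""
    return [a - b for a, b in zip(x1, x2)]


# Hypothetical objects: higher relevance means the object should rank higher.
example_features = [(0.9, 0.1), (0.4, 0.7), (0.2, 0.2)]
example_relevance = [1, 2, 0]

pairs = make_pairwise_samples(example_features, example_relevance)
labels = [label for _, _, label in pairs]
first_diff = difference_vector(pairs[0][0], pairs[0][1])
```

A list of n objects without ties yields n(n − 1)/2 samples, which is why pairwise training sets grow quadratically with list length.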

In the listwise approach, the ranking structure is explicitly considered by using ranking lists as instances in training and prediction. This approach requires a measure of similarity between two ranking lists when computing the loss function. An advantage of this approach is that evaluation measures can be directly incorporated into the loss function. However, the algorithms of this approach are quite computationally expensive.

Many learning-to-rank algorithms have been proposed, including pointwise methods such as SLR [15] and Pranking [17]; pairwise methods such as RankSVM [22], RankBoost [19], and LambdaRank [10]; and listwise methods such as ListNet [12], ListMLE [43], BoltzRank [41], and BayesRank [24]. Below we give a brief introduction to three learning-to-rank algorithms: one pairwise algorithm, RankSVM, and two listwise algorithms, ListNet and ListMLE. These algorithms are fast, simple, and have been shown to be effective on benchmark datasets [12,43].

  • RankSVM: RankSVM [21,22] is one of the first learning-to-rank methods. It transforms a ranking task into pairwise classification and employs a Support Vector Machine (SVM) [16] to perform the learning task. Formally, RankSVM computes x1 − x2, where x1 and x2 are the feature vectors of a pair of objects, and then uses an SVM to classify (x1 − x2) into +1 or −1.
  • ListNet: ListNet is a listwise method for learning-to-rank tasks proposed by Cao et al. [12]. ListNet uses cross-entropy as the loss function, a neural network as the model, and gradient descent as the learning algorithm. The learning algorithm of ListNet is an online algorithm that runs for T iterations. In each iteration, the training samples are fed to the neural network model one by one; for each sample, the output is computed and the parameters of the model are updated.
  • ListMLE: ListMLE [43] is a listwise method which uses the likelihood loss as the loss function. Like ListNet, it uses a neural network as the model. The core of the ListMLE algorithm also consists of two main steps: computing the score list for the sample using the current value of the parameter vector, and then updating the parameter vector using the gradient.
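
To make the two listwise loss functions more tangible, the sketch below computes them for a single list of items. It follows the commonly cited forms of these losses (the top-one probability cross-entropy for ListNet and the negative log-likelihood of the ground-truth permutation under a Plackett-Luce model for ListMLE). It only evaluates the losses for given scores; the neural network model and the gradient updates are omitted, and the example scores are assumed values.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax, used as the top-one probability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_top_one_loss(
    true_scores: Sequence[float],
    predicted_scores: Sequence[float],
) -> float:
    """Cross-entropy between top-one distributions of true and predicted scores."""
    p_true = softmax(true_scores)
    p_pred = softmax(predicted_scores)
    return -sum(pt * math.log(pp) for pt, pp in zip(p_true, p_pred))


def listmle_loss(
    true_order: Sequence[int],
    predicted_scores: Sequence[float],
) -> float:
    """Negative log-likelihood of the ground-truth permutation.

    true_order[i] is the index of the item that belongs at position i.
    """
    ordered = [predicted_scores[idx] for idx in true_order]
    loss = 0.0
    for i in range(len(ordered)):
        tail = ordered[i:]
        m = max(tail)
        # log of the sum of exp(score) over items not yet placed
        log_sum = m + math.log(sum(math.exp(s - m) for s in tail))
        loss += log_sum - ordered[i]
    return loss


# Item 0 should rank first, then item 1, then item 2.
truth_scores = [3.0, 2.0, 1.0]
truth_order = [0, 1, 2]
good_prediction = [2.5, 1.0, 0.2]  # agrees with the true order
bad_prediction = [0.2, 1.0, 2.5]   # reverses the true order

listnet_good = listnet_top_one_loss(truth_scores, good_prediction)
listnet_bad = listnet_top_one_loss(truth_scores, bad_prediction)
listmle_good = listmle_loss(truth_order, good_prediction)
listmle_bad = listmle_loss(truth_order, bad_prediction)
```

Both losses are smaller for the prediction that agrees with the ground-truth order, which is the signal the gradient step uses to update the model parameters.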

5. Conclusions and future work

We have presented a study on personalized recommendation for forum users. We investigated and compared several CBF methods, and introduced a new CF algorithm. The proposed CF algorithm relies on co-commenting relationships among users to make predictions. It has been designed to be simple and fast, which is important for coping with the dynamic nature of forums. We also introduced a novel hybrid recommendation method, which combines content-based and collaborative features using a learning-to-rank framework. The effectiveness of the investigated methods has been evaluated on datasets collected from an online forum and a forum-based news service. The results show that (1) the CBF method with topic-based representation performs better than a random baseline; (2) the CF method outperforms the baseline and the CBF methods by large margins, and the CF method remains accurate when dealing with new posts receiving just a few comments; (3) the hybrid method using learning-to-rank is the most accurate and outperforms both CBF and CF methods.

Our framework is general in the sense that it can combine different types of features and is not restricted to any particular kind of data. Moreover, the combination weights are set automatically by a learning-to-rank algorithm. The proposed method can therefore be easily adapted to other recommendation tasks in real-world applications, such as recommending electronic devices, music, movies, books, or hotels.

There are several ways to extend the current work. First, it is useful to investigate the influence of comment content and comment relationships, for example how comments on comments are different from comments on posts. Second, some forums provide additional information such as ratings, which may be useful to integrate into the prediction model. We plan to make such extensions in future work.


  [1] G. Adomavicius, A. Tuzhilin, Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions, IEEE Trans. Knowl. Data Eng. 17 (6) (2005) 734–749.
  [2] D. Agarwal, B. Chen, B. Pang, Personalized recommendation of user comments via factor models, in: Proceedings of the 2011 IEEE Conference on Empirical Methods in Natural Language Processing (EMNLP), 2011, pp. 571–582.
  [3] M. Aharon, A. Kagian, Y. Koren, R. Lempel, Dynamic personalized recommendation of comment-eliciting stories, in: Proceedings of Sixth ACM Conference on Recommender Systems, 2012, pp. 209–212.
  [4] A. Antikacioglu, R. Ravi, S. Sridhar, Recommendation subgraphs for web discovery, in: Proceedings of the Twenty-fourth International Conference on World Wide Web (WWW), 2015, pp. 77–87.
  [5] T. Bansal, M. Das, C. Bhattacharyya, Content driven user profiling for comment-worthy recommendations of news and blog articles, in: Proceedings of the Ninth ACM Conference on Recommender Systems, 2015, pp. 195–202.
  [6] I. Barjasteh, R. Forsati, F. Masrour, A. Esfahanian, H. Radha, Cold-start item and user recommendation with decoupled completion and transduction, in: Proceedings of Ninth ACM Conference on Recommender Systems, 2015, pp. 91–98.
  [7] J. Basilico, T. Hofmann, Unifying collaborative and content-based filtering, in: Proceedings of the Twenty-first International Conference on Machine Learning (ICML), 2004.
  [8] D. Billsus, M. Pazzani, A personal news agent that talks, learns and explains, in: Proceedings of the Third Annual Conference on Autonomous Agents, 1999, pp. 268–275.
  [9] D. Blei, A. Ng, M. Jordan, Latent Dirichlet allocation, J. Mach. Learn. Res. 3 (2003) 993–1022.
  [10] C. Burges, R. Ragno, Q. Le, Learning to rank with nonsmooth cost functions, in: Proceedings of the 2006 Advances in Neural Information Processing Systems (NIPS), 2006, pp. 193–200.
  [11] R. Burke, Hybrid web recommender systems, in: The Adaptive Web, 2007, pp. 377–408.
  [12] Z. Cao, T. Qin, T. Liu, M. Tsai, H. Li, Learning to rank: From pairwise approach to listwise approach, in: Proceedings of the Twenty-fourth International Conference on Machine Learning (ICML), 2007, pp. 129–136.
  [13] A. Chaney, D. Blei, T. Eliassi-Rad, A probabilistic model for using social networks in personalized item recommendation, in: Proceedings of the Ninth ACM Conference on Recommender Systems, 2015, pp. 43–50.
  [14] W. Chu, S. Park, Personalized recommendation on dynamic content using predictive bilinear models, in: Proceedings of the Eighteenth International Conference on World Wide Web (WWW), 2009, pp. 691–700.
  [15] W.S. Cooper, F.C. Gey, D.P. Dabney, Probabilistic retrieval based on staged logistic regression, in: Proceedings of the Fifteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1992, pp. 198–210.
  [16] C. Cortes, V. Vapnik, Support-vector networks, Mach. Learn. 20 (3) (1995) 273–297.
  [17] K. Crammer, Y. Singer, Pranking with Ranking, in: Proceedings of the 2001 Advances in Neural Information Processing Systems (NIPS), 2001, pp. 641–647.
  [18] A. Das, M. Datar, A. Garg, S. Rajaram, Google news personalization: scalable online collaborative filtering, in: Proceedings of the Sixteenth International Conference on World Wide Web (WWW), 2007, pp. 271–280.
  [19] Y. Freund, R. Iyer, R.E. Schapire, Y. Singer, An efficient boosting algorithm for combining preferences, J. Mach. Learn. Res. 4 (2003) 933–969.
  [20] J. Hannon, M. Bennett, B. Smyth, Recommending Twitter users to follow using content and collaborative filtering approaches, in: Proceedings of the Fourth ACM Conference on Recommender Systems, 2010, pp. 199–206.
  [21] R. Herbrich, T. Graepel, K. Obermayer, Large Margin Rank Boundaries for Ordinal Regression, Advances in Large Margin Classifiers, Chapter 7, MIT Press, 2000, pp. 115–132.
  [22] T. Joachims, Optimizing search engines using clickthrough data, in: Proceedings of the 2002 ACM Conference on Knowledge Discovery and Data Mining (KDD), 2002, pp. 133–142.
  [23] M. Kahng, S. Lee, S. Lee, Ranking in context-aware recommender systems, in: Proceedings of the Twentieth International Conference Companion on World Wide Web (WWW), 2011, pp. 65–66.
  [24] J. Kuo, P. Cheng, H. Wang, Learning to rank from Bayesian decision inference, in: Proceedings of the Eighteenth ACM Conference on Information and Knowledge Management (CIKM), 2009, pp. 827–836.
  [25] H. Li, Learning to Rank for Information Retrieval and Natural Language Processing, Synthesis Lectures on Human Language Technologies, Morgan & Claypool, 2011.
  [26] L. Li, H. Tong, N. Cao, K. Ehrlich, Y. Lin, N. Buchler, Replacing the irreplaceable: Fast algorithms for team member recommendation, in: Proceedings of the Twenty-fourth International Conference on World Wide Web (WWW), 2015, pp. 636–646.
  [27] L. Li, D. Wang, T. Li, D. Knox, B. Padmanabhan, Scene: A scalable two-stage personalized news recommendation system, in: Proceedings of the Thirty-fourth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2011, pp. 125–134.
  [28] Q. Li, J. Wang, Y. Chen, Z. Lin, User comments for news recommendation in forum-based social media, Inf. Sci. 180 (2010) 4929–4939.
  [29] J. Liu, P. Dolan, E. Pedersen, Personalized news recommendation based on click behavior, in: Proceedings of Fifteenth International Conference on Intelligent User Interfaces, 2010, pp. 31–40.
  [30] T. Liu, Learning to Rank for Information Retrieval, Springer, 2011.
  [31] C. Martinez-Cruz, C. Porcel, J. Bernabe-Moreno, E. Herrera-Viedma, A model to represent users trust in recommender systems using ontologies and fuzzy linguistic modeling, Inf. Sci. 311 (2015) 102–118.
  [32] P. Melville, R.J. Mooney, R. Nagarajan, Content-boosted collaborative filtering for improved recommendations, in: Proceedings of the Eighteenth National Conference on Artificial Intelligence, 2002, pp. 187–192.
  [33] P. Melville, V. Sindhwani, Recommender systems, in: Encyclopedia of Machine Learning, 2010.
  [34] W. Pan, S. Xia, Z. Liu, X. Peng, Z. Ming, Mixed factorization for collaborative recommendation with heterogeneous explicit feedbacks, Inf. Sci. 332 (2016) 84–93.
  [35] N.D. Phuong, L.Q. Thang, T.M. Phuong, A graph-based method for combining collaborative and content-based filtering, in: Proceedings of the Tenth Pacific Rim International Conference on Artificial Intelligence (PRICAI), Springer Berlin Heidelberg, 2008, pp. 859–869.
  [36] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, J. Riedl, Grouplens: An open architecture for collaborative filtering of netnews, in: Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work (CSCW), 1994, pp. 175–186.
  [37] E. Shmueli, A. Kagian, Y. Koren, R. Lempel, Care to comment? recommendations for commenting on news stories, in: Proceedings of the Twenty-first International Conference on World Wide Web (WWW), 2012, pp. 429–438.
  [38] S. Siersdorfer, S. Chelaru, J.S. Pedro, I.S. Altingovde, W. Nejdl, Analyzing and mining comments and comment ratings on the social web, ACM Trans. Web 8 (3) (2014) 17:1–17:39.
  [39] J. Sun, S. Wang, B.J. Gao, J. Ma, Learning to rank for hybrid recommendation, in: Proceedings of the Twenty-first ACM International Conference on Information and Knowledge Management (CIKM), 2012, pp. 2239–2242.
  [40] M. Tsagkias, W. Weerkamp, M. Rijke, Predicting the volume of comments on online news stories, in: Proceedings of the Eighteenth ACM Conference on Information and Knowledge Management (CIKM), 2009, pp. 1765–1768.
  [41] M.N. Volkovs, R.S. Zemel, Boltzrank: Learning to maximize expected ranking gain, in: Proceedings of the Twenty-sixth Annual International Conference on Machine Learning (ICML), 2009, pp. 1089–1096.
  [42] T. Watanabe, Optimized online rank learning for machine translation, in: Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT, 2012, pp. 253–262.
  [43] F. Xia, T. Liu, J. Wang, W. Zhang, H. Li, Listwise approach to learning to rank: Theory and algorithm, in: Proceedings of the Twenty-fifth International Conference on Machine Learning (ICML), 2008, pp. 1192–1199.
  [44] F. Zhang, K. Zheng, N. Yuan, X. Xie, E. Chen, X. Zhou, A novelty-seeking based dining recommender system, in: Proceedings of the Twenty-fourth International Conference on World Wide Web (WWW), 2015, pp. 1362–1372.
  [45] X. Zhou, S. Wu, C. Chen, G. Chen, S. Ying, Real-time recommendation for microblogs, Inf. Sci. 279 (2014) 301–325.