Image reranking is effective for improving the performance of a text-based image search. However, existing reranking algorithms are limited for two main reasons: 1) the textual meta-data associated with images is often mismatched with their actual visual content and 2) the extracted visual features do not accurately describe the semantic similarities between images. Recently, user click information has been used in image reranking, because clicks have been shown to more accurately describe the relevance of retrieved images to search queries. However, a critical problem for click-based methods is the lack of click data, since only a small number of web images have actually been clicked on by users. Therefore, we aim to solve this problem by predicting image clicks. We propose a multimodal hypergraph learning-based sparse coding method for image click prediction, and apply the obtained click data to the reranking of images. We adopt a hypergraph to build a group of manifolds, which explore the complementarity of different features through a group of weights. Unlike an edge in a graph, which connects two vertices, a hyperedge in a hypergraph connects a set of vertices, and helps preserve the local smoothness of the constructed sparse codes. An alternating optimization procedure is then performed, and the weights of different modalities and the sparse codes are simultaneously obtained. Finally, a voting strategy is used to describe the predicted click as a binary event (click or no click), from the images’ corresponding sparse codes. Thorough empirical studies on a large-scale database including nearly 330K images demonstrate the effectiveness of our approach for click prediction when compared with several other methods. Additional image re-ranking experiments on real-world data show that the use of click prediction is beneficial to improving the performance of prominent graph-based image re-ranking algorithms.
Most existing re-ranking methods use a tool known as pseudo-relevance feedback (PRF), where a proportion of the top-ranked images are assumed to be relevant, and are subsequently used to build a model for re-ranking. This is in contrast to relevance feedback, where users explicitly provide feedback by labeling the top results as positive or negative. In the classification-based PRF method, the top-ranked images are regarded as pseudo-positive examples and the low-ranked images as pseudo-negative examples; a classifier is trained on these examples and then used to re-rank the images. Hsu et al. also adopt this pseudo-positive and pseudo-negative image method to develop a clustering-based re-ranking algorithm.
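The classification-based PRF idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy feature vectors, the choice of two pseudo-positive and two pseudo-negative images, and the nearest-centroid scorer (a stand-in for whatever classifier a real system would train) are all hypothetical and do not come from the source.

```python
# Sketch of classification-based pseudo-relevance feedback (PRF) re-ranking.
# Assumption: a nearest-centroid scorer stands in for the trained classifier.

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def prf_rerank(ranked_features, n_pos=2, n_neg=2):
    """Return image indices re-ordered by the PRF model.

    ranked_features: feature vectors listed in the order of the initial
    text-based ranking (index 0 is the top result).
    """
    pseudo_pos = ranked_features[:n_pos]    # top-ranked: assumed relevant
    pseudo_neg = ranked_features[-n_neg:]   # low-ranked: assumed non-relevant
    c_pos, c_neg = centroid(pseudo_pos), centroid(pseudo_neg)
    # Higher score means closer to the pseudo-positive centroid
    # than to the pseudo-negative centroid.
    scores = [sq_dist(f, c_neg) - sq_dist(f, c_pos) for f in ranked_features]
    return sorted(range(len(ranked_features)), key=lambda i: -scores[i])

# Toy initial ranking: image 3 looks like the top results but was ranked
# below image 2, which looks like the bottom results.
initial = [[1.0, 0.9], [0.9, 1.0], [0.1, 0.2],
           [0.95, 0.9], [0.0, 0.1], [0.2, 0.0]]
order = prf_rerank(initial)
```

In this toy run, image 3 moves ahead of image 2. The sketch also shows the weakness noted in the next section: if the top-ranked images are themselves irrelevant, the model is trained on wrong labels.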
Disadvantages of Existing System:
A major problem affecting performance is the mismatch between the actual content of an image and the textual data on its web page. In addition, the reliability of the pseudo-positive and pseudo-negative images obtained by these methods is not guaranteed.
Proposed System:
In this paper we propose a novel method named multimodal hypergraph learning-based sparse coding for click prediction, and apply the predicted clicks to re-rank web images. Both early and late fusion of multiple features are used in this method, through three main steps.
First, we construct a web image base with associated click annotation, collected from a commercial search engine. The search engine has recorded clicks for each image: images with high click counts are strongly relevant to the queries, while non-relevant images have zero clicks. These two components form the image bases.
Second, we consider both early and late fusion in the proposed objective function. The early fusion is realized by directly concatenating multiple visual features, and is applied in the sparse coding term. Late fusion is accomplished in the manifold learning term. For web images without clicks, we implement hypergraph learning to construct a group of manifolds, which preserves local smoothness using hyperedges. Unlike an edge in a graph, which connects two vertices, a hyperedge in a hypergraph connects a set of vertices. Common graph-based learning methods usually consider only the pairwise relationship between two vertices, ignoring the higher-order relationships among three or more vertices. Using this manifold learning term helps the proposed method preserve the local smoothness of the constructed sparse codes.
Finally, an alternating optimization procedure is conducted to explore the complementary nature of different modalities. The weights of different modalities and the sparse codes are simultaneously obtained using this optimization strategy.
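To make the fusion and hypergraph ideas more concrete, the following Python sketch builds one hypergraph per feature modality and measures how smooth a set of codes is over it. It rests on several assumptions that are not in the source: hyperedges are formed from each image and its k nearest neighbours, the smoothness measure is the commonly used normalized hypergraph regularizer, the modality names (color, texture) and all function names are hypothetical, and the modality weights are fixed by hand here, whereas the proposed method learns them through alternating optimization.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_hyperedges(features, k):
    """One hyperedge per vertex: the vertex plus its k nearest neighbours."""
    edges = []
    for i, f in enumerate(features):
        others = sorted((j for j in range(len(features)) if j != i),
                        key=lambda j: sq_dist(f, features[j]))
        edges.append(frozenset([i] + others[:k]))
    return edges

def hypergraph_smoothness(codes, edges, weights=None):
    """Normalized hypergraph regularizer (assumed form):
    0.5 * sum_e sum_{u,v in e} w(e)/|e| * ||c_u/sqrt(d_u) - c_v/sqrt(d_v)||^2
    Small values mean vertices sharing a hyperedge have similar codes.
    """
    weights = weights or [1.0] * len(edges)
    # Vertex degree: total weight of the hyperedges containing the vertex.
    degree = [0.0] * len(codes)
    for e, w in zip(edges, weights):
        for v in e:
            degree[v] += w
    scaled = [[x / math.sqrt(degree[v]) for x in codes[v]]
              for v in range(len(codes))]
    total = 0.0
    for e, w in zip(edges, weights):
        for u in e:
            for v in e:
                total += w / len(e) * sq_dist(scaled[u], scaled[v])
    return 0.5 * total

def fused_smoothness(codes, modalities, alphas, k=1):
    """Late fusion: weighted sum of per-modality smoothness terms."""
    return sum(a * hypergraph_smoothness(codes, knn_hyperedges(feats, k))
               for a, feats in zip(alphas, modalities))

# Two hypothetical modalities for four images (two visual groups).
color = [[0.0], [0.1], [1.0], [1.1]]
texture = [[0.0], [0.2], [0.9], [1.0]]

# Early fusion: concatenate the per-modality features of each image.
early_fused = [c + t for c, t in zip(color, texture)]

edges_color = knn_hyperedges(color, k=1)
smooth_codes = [[1, 0], [1, 0], [0, 1], [0, 1]]  # agree within each group
rough_codes = [[1, 0], [0, 1], [1, 0], [0, 1]]   # disagree within each group
smooth_cost = fused_smoothness(smooth_codes, [color, texture], [0.5, 0.5])
rough_cost = fused_smoothness(rough_codes, [color, texture], [0.5, 0.5])
```

Codes that agree inside each hyperedge give a lower cost than codes that disagree, which is the sense in which a manifold term of this kind favours locally smooth sparse codes.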
A voting strategy is then adopted to predict if an input image will be clicked or not, based on its sparse code.
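The text does not spell out the voting rule, so the following Python sketch shows only one plausible reading: each base image votes with the magnitude of its sparse-code coefficient, and the input image is predicted as clicked when the clicked bases collect more weight than the non-clicked ones. The function name `predict_click`, the tie-breaking rule, and the example codes are assumptions made for illustration.

```python
def predict_click(sparse_code, base_is_clicked):
    """Binary click prediction from a sparse code (assumed voting rule).

    sparse_code: one coefficient per base image.
    base_is_clicked: True if the corresponding base image has recorded clicks.
    Ties are resolved as "no click" (an arbitrary choice for this sketch).
    """
    clicked_vote = sum(abs(c) for c, clicked
                       in zip(sparse_code, base_is_clicked) if clicked)
    unclicked_vote = sum(abs(c) for c, clicked
                         in zip(sparse_code, base_is_clicked) if not clicked)
    return clicked_vote > unclicked_vote

# Hypothetical image base: two clicked images followed by two non-clicked ones.
bases = [True, True, False, False]
likely_clicked = predict_click([0.7, 0.0, 0.2, 0.0], bases)
likely_unclicked = predict_click([0.0, 0.1, 0.0, 0.6], bases)
```

Because a sparse code has only a few non-zero coefficients, the vote is decided by the handful of base images that best reconstruct the input image.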
Advantages of Proposed System:
First, we effectively utilize search engine derived images annotated with clicks, and successfully predict the clicks for new input images without clicks. Based on the obtained clicks, we re-rank the images, a strategy which could be beneficial for improving commercial image searching. Second, we propose a novel method named multimodal hypergraph learning-based sparse coding. This method uses both early and late fusion in multimodal learning. By simultaneously learning the sparse codes and the weights of different hypergraphs, the performance of sparse coding improves significantly.
Reference: Jun Yu, Member, IEEE, Yong Rui, Fellow, IEEE, and Dacheng Tao, Senior Member, IEEE, “Click Prediction for Web Image Reranking Using Multimodal Sparse Coding,” IEEE Transactions on Image Processing, vol. 23, no. 5, May 2014.