
VISUAL CONTENT PATTERN ANALYSING AND IMAGE RETRIEVAL

Major Area: Data Mining
Reg. No: 2010231051
Name of the student: S. Vignesh
Name of the guide: Dr. S. Valli, Associate Professor, Department of CSE, Anna University, Chennai.

CHAPTER 1 INTRODUCTION

In this decade, there has been rapid growth in the use of digital media such as images, video, and audio. As the use of digital media increases, effective retrieval and management techniques become more important. Such techniques are required to facilitate effective searching and browsing of large multimedia databases. In response to this need, researchers have tried extending Information Retrieval (IR) techniques used in text retrieval to the area of image retrieval. The purpose of an image database is to store and retrieve an image or image sequence that is relevant to a query. A variety of domains, such as information retrieval, computer graphics, database management and user behaviour, have evolved separately but are interrelated and provide valuable contributions to this research subject. As more and more visual information becomes available in digital archives, the need for effective image retrieval has become clear. In image retrieval research, the field has moved from keyword-based to content-based and then towards semantic-based image retrieval, and the main problem encountered in content-based image retrieval research is the semantic gap between the low-level feature representation and the high-level semantics of the images. The majority of images in the real world are stored as raster images. An image can be viewed as a vector of pixels, where every pixel is described by its colour; the vector of pixels acts as a kind of keyword set for the image. However, a human observer extracts from an image the important features that define its semantics for him: he does not think about pixels, but about the persons or objects in the image. We therefore need a technique that is able to extract these features and that is resistant to minor changes in images, e.g. the amount of light, contrast, and movement of objects in the images. Direct use of keyword-based systems leads to results that are sensitive to a small change in any keyword.
Text-based image retrieval techniques employ text to describe the content of the image, which often causes ambiguity and inadequacy when performing an image database search and query processing. This problem is due to the difficulty of specifying exact terms and phrases to describe the content of images, as the content of an image is much richer than what any set of keywords can express. Since the textual annotations are based on language, variations in annotation pose challenges to image retrieval. Although there are many sophisticated algorithms to describe colour, shape and texture features, these algorithms still do not match human perception. This is mainly due to the inability of low-level image features to describe the high-level concepts in the user's mind, for example, finding an image of a little boy playing with a ball in a garden. The only way a machine is able to perform automatic extraction is by extracting the low-level features, represented by the colour, texture, shape and spatial layout of images, with a good degree of efficiency. Content-based image retrieval, the problem of searching large image repositories according to their content, has been the subject of a significant amount of computer vision research in the recent past. While early retrieval architectures were based on the query-by-example paradigm, which formulates image retrieval as the search for the best database match to a user-provided query image, it was quickly realized that the design of fully functional retrieval systems would require support for semantic queries. These are systems where the images in the database are annotated with semantic keywords, enabling the user to specify the query through a natural-language description of the visual concepts of interest. This realization, combined with the cost of manual image labelling, generated significant interest in the problem of automatically extracting semantic descriptors from images.
This approach, the visual content analyser, can be used in many practical real-time applications involving high complexity, such as crime science. A set of keywords is then assigned to each image. In this approach the keywords are fed into the system manually, and machine learning concepts are utilized so that the accuracy of the system grows with the number of training images. Content-based image retrieval using semantic features is another recent solution to the semantic gap. Semantic features differ from visual features in many aspects. While visual features are general and can be used with different image types and modalities, semantic features are domain specific. For example, in the domain of lung CT, a combination of visual features such as gray-scale histograms and wavelet transform coefficients may be used to compose the feature vector. On the other hand, a semantic feature vector may be composed of the semantic categories present in the image, such as soft tissue, lung tissue, heart, etc. While gray-scale histograms and wavelet transform coefficients are common features that could be used to describe other image modalities, the semantic features mentioned above are suitable only for lung CTs. The concept of semantic features may be ambiguous, since computers are not yet able to capture semantics; in fact semantic features are based on visual features, but only in the early steps of processing. Content-based image retrieval uses the visual contents of an image, such as colour, shape, texture, and spatial layout, to represent and index the image. In typical content-based image retrieval systems, the visual contents of the images in the database are extracted and described by multi-dimensional feature vectors. The feature vectors of the images in the database form a feature database. To retrieve images, users provide the retrieval system with example images or sketched figures. The system then converts these examples into its internal representation of feature vectors. The similarities/distances between the feature vector of the query example or sketch and those of the images in the database are then calculated, and retrieval is performed with the aid of an indexing scheme. The indexing scheme provides an efficient way to search the image database. Recent retrieval systems have incorporated users' relevance feedback to modify the retrieval process in order to generate perceptually and semantically more meaningful retrieval results. The biggest issue for a CBIR system is to incorporate versatile techniques so as to process images of diversified characteristics and categories.
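The query-by-example pipeline just described (extract feature vectors, compute distances, rank the database) can be sketched in a few lines. This is a minimal illustration, assuming images are flat lists of RGB tuples and using a quantized colour histogram as the feature vector; a real system would add texture, shape and spatial-layout features plus an indexing scheme.

```python
def color_histogram(image, bins=4):
    """Quantize each RGB channel into `bins` levels and count pixels.

    `image` is a list of (r, g, b) tuples with values in 0..255.
    Returns a normalized feature vector of length bins**3.
    """
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in image:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1
    total = float(len(image))
    return [h / total for h in hist]

def euclidean(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def rank(query_img, database):
    """Return database keys ordered by feature distance to the query."""
    q = color_histogram(query_img)
    feats = {name: color_histogram(img) for name, img in database.items()}
    return sorted(feats, key=lambda name: euclidean(q, feats[name]))

# Toy example: a red query against a reddish image and a blue image.
red = [(250, 10, 10)] * 16
reddish = [(200, 40, 40)] * 16
blue = [(10, 10, 250)] * 16
print(rank(red, {"reddish": reddish, "blue": blue}))  # reddish ranks first
```

The histogram makes the match tolerant to small pixel-level changes, which is exactly why such features are preferred over direct pixel comparison.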
Many techniques for processing low-level cues are distinguished by the characteristics of the domain images. The performance of these techniques is challenged by various factors such as image resolution, intra-image illumination variations, non-homogeneity of intra-region and inter-region textures, multiple and occluded objects, etc. The other major difficulty, described in the literature as the semantic gap, is the gap between the understanding or semantics inferred by pixel-domain processing using low-level cues and human perception of the visual cues of a given image. In other words, there exists a gap between the mapping of extracted features and human-perceived semantics. The difficulty becomes worse because of subjectivity in the visually perceived semantics, making image content description a subjective phenomenon of human perception, characterized by human psychology, emotions, and imagination. The road map of development of CBIR techniques began with simple primitive-feature-based indexing methodologies that were later enhanced with combinational features. Two major issues, the semantic gap and the subjectivity of semantics, are addressed by state-of-the-art techniques. Many state-of-the-art techniques incorporate iterative relevance feedback from the user for refinement of results. Semantic-gap-bridging approaches based on fuzzy, evolutionary and neural-network methods have also been reported. Hierarchical approaches to feature extraction and representation achieve hierarchical abstraction and help to match the semantics of human visual perception. Several modern techniques focus on improvements in the processing of low-level cues so as to extract features precisely. We propose that a hierarchical approach based on prominent boundary detection, with region feature extraction, would significantly improve the quality of retrieval results. Many state-of-the-art techniques suggest that semantic-domain-based image retrieval systems, which compare meaningful concepts, improve the quality of the retrieved image set. The image retrieval system comprises multiple inter-dependent tasks performed by various phases, and inter-tuning of all these phases of the retrieval system is essential for good overall results. The diversity of the images and the semantic gap generally require parameter tuning and threshold-value specification suited to the requirements. For the development of a real-time CBIR system, feature processing time and query response time should be optimized. Better performance can be achieved if the feature dimensionality and the space complexity of the algorithms are optimized. Specific issues pertaining to application domains must be addressed to meet application-specific requirements.
In the 2000s, semantic-based image retrieval was introduced. This is because neither a single feature nor a combination of multiple visual features can fully capture the high-level concepts of images. Besides, since the performance of image retrieval systems based on low-level features is not satisfactory, the mainstream of research has converged on retrieval based on semantic meaning, trying to extract the cognitive concepts of a human in order to map the low-level image features to high-level concepts. In addition, representing the image content with semantic terms allows users to access images through text queries, which is more intuitive, easier and preferred by front-end users to express what is in their minds, compared with using images.

Semantic content representation has been identified as an important issue in bridging the semantic gap in visual information access. A good description and representation of an image is able to capture the meaningful contents of the image. Current research often represents images in terms of labelled regions or objects, but pays little attention to the spatial positions of, or relationships between, those regions or objects. Spatial relationships are needed in order to further increase confidence in image understanding. Besides, users prefer to express their information needs at the semantic level instead of at the level of preliminary image features. Moreover, textual queries usually provide a more accurate description of users' information needs. The performance of traditional content-based image retrieval systems is far from users' expectations due to the semantic gap between low-level visual features and the richness of human semantics. In an attempt to reduce the semantic gap, a region-based image retrieval system with high-level semantic colour names can be used. For this, database images are segmented into colour-texture homogeneous regions. For each region, we define a colour name as that used in our daily life. In the retrieval process, images containing regions with the same colour name as that of the query are selected as candidates. In this way, the system reduces the semantic gap between numerical image features and the rich semantics in the user's mind.
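The colour-naming step can be illustrated as nearest-prototype matching from a region's average colour to a small palette of everyday colour names. This is a sketch only: the palette values below are hypothetical placeholders, not the colour definitions used by any cited system, which would typically come from a perceptual colour-naming study.

```python
# Hypothetical palette of "daily life" colour names (illustrative values).
PALETTE = {
    "red": (220, 30, 30),
    "green": (30, 180, 60),
    "blue": (30, 60, 220),
    "yellow": (230, 220, 40),
    "black": (15, 15, 15),
    "white": (245, 245, 245),
}

def colour_name(region_pixels):
    """Assign the nearest palette name to a region's average colour."""
    n = len(region_pixels)
    avg = tuple(sum(p[i] for p in region_pixels) / n for i in range(3))

    def dist(c):
        # Squared distance is enough for picking the minimum.
        return sum((a - b) ** 2 for a, b in zip(avg, c))

    return min(PALETTE, key=lambda name: dist(PALETTE[name]))

sky_region = [(40, 70, 210), (35, 65, 205), (45, 75, 215)]
print(colour_name(sky_region))  # "blue"
```

At query time, candidate images are simply those containing at least one region whose assigned name matches the name of the query region.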

1.1 OBJECTIVE

The objective is to provide semantic search through a composite approach with improved accuracy and speed. The varied approach guarantees an efficient system that produces more appropriate results.

1.2 NOVEL IDEA

Retrieving the images that are similar to the query image, together with the category of the query content and the categories of the matching search results, adds more meaning to the result and yields a better system.

CHAPTER 2 LITERATURE SURVEY

[1] Block Truncation Coding (BTC) based features constitute one of the CBIR methods proposed using the colour features of an image. The approach considers the red, green and blue planes of the image together to compute the feature vector. The colour averaging methods used here are BTC Level-1, BTC Level-2 and BTC Level-3. The feature vector size per image is greatly reduced by taking the mean of each plane as a threshold value and dividing each plane using that threshold; colour averaging is then applied, and precision and recall are calculated to measure the performance of the algorithm.

[2] A new bag-based re-ranking framework for large-scale TBIR. Specifically, relevant images are first clustered using both textual and visual features. By treating each cluster as a bag and the images in the bag as instances, the problem is formulated as a multi-instance (MI) learning problem, so MI learning methods such as mi-SVM can be readily incorporated into the bag-based re-ranking framework. In this generalized MI (GMI) setting, at least a certain portion of a positive bag consists of positive instances, while a negative bag might also contain positive instances. To address the resulting ambiguity in the instance labels of the positive and negative bags, a new method referred to as GMI-SVM is developed, which enhances retrieval performance by propagating the labels from the bag level to the instance level.

[3] A novel idea of sectorization of the DCT-DST plane of row-wise transformed images, with feature vector generation both with and without augmentation of the zeroth column component of the DCT-transformed image and the last column component of the DST-transformed image. Two similarity measures, namely the sum of absolute differences and the Euclidean distance, are used, and their results are compared. The cross-over point performance of the overall average of precision and recall for different sector sizes is studied and analysed comparatively.
The augmented Wang [20] image database of 1055 images, consisting of 12 different classes, is used. The DCT/DST plane is divided into 4, 8, 12 and 16 sectors. The overall average precision-recall cross-over point with augmentation of the average of the zeroth column from the DCT-transformed image gives better retrieval results. Using the sum of absolute differences as the similarity measure always gives lower computational complexity and a better relevant image retrieval rate than the Euclidean distance.
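The BTC Level-1 colour averaging described in [1] can be sketched as follows. This is a minimal illustration under the assumption that each colour plane is given as a flat list of pixel values; the higher BTC levels in [1] subdivide the planes further, which is not shown here.

```python
# Sketch of BTC Level-1 features: for each colour plane, the plane mean
# acts as a threshold splitting pixels into an "upper" and a "lower"
# group, and the means of those two groups form the feature vector.
def btc_level1(planes):
    """planes: dict mapping "r"/"g"/"b" to flat lists of values in 0..255.

    Returns a 6-element feature vector: (upper mean, lower mean) per plane.
    """
    feature = []
    for name in ("r", "g", "b"):
        plane = planes[name]
        threshold = sum(plane) / len(plane)          # plane mean as threshold
        upper = [p for p in plane if p >= threshold]
        lower = [p for p in plane if p < threshold]
        feature.append(sum(upper) / len(upper) if upper else threshold)
        feature.append(sum(lower) / len(lower) if lower else threshold)
    return feature

planes = {"r": [200, 210, 20, 30],
          "g": [100, 100, 100, 100],
          "b": [0, 255, 0, 255]}
print(btc_level1(planes))  # [205.0, 25.0, 100.0, 100.0, 255.0, 0.0]
```

Compared with a full histogram, this reduces the per-image feature vector to just six numbers, which is the size reduction [1] emphasizes.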

[4] A new form of semi-supervised C-SVM algorithm that exploits the intrinsic data distribution by working directly on equiprobable envelopes of Gaussian mixture components. In addition, an active learning strategy is introduced which allows the equiprobable envelopes to be adjusted interactively in a small number of feedback steps. The proposed method allows the exploitation of the information contained in the unlabelled data and does not suffer from the drawbacks inherent to semi-supervised methods, e.g. computation time and memory requirements. Tests performed on a database of high-resolution satellite images and on a database of colour images show that the system compares favourably, in terms of learning speed and ability to manage large volumes of data, with the classic approach using SVM active learning.

[5] Images with small semantic gaps are first selected and clustered by defining a confidence score and a content-context similarity matrix in visual space and textual space. Then, from the surrounding descriptions (titles, categories, and comments) of these images, concepts with small semantic gaps are automatically mined. In addition, considering that semantic-gap analysis depends on both features and content-contextual consistency, a lexicon family of high-level concepts with small semantic gaps (LCSS) based on different low-level features and different consistency measurements is constructed. The lexica in this set are both independent of each other and mutually complementary. LCSS is very helpful for data collection, feature selection, annotation and modelling for large-scale image retrieval.

[6] Principal component analysis (PCA) is applied to extract significant image features. It is incorporated with the proposed two-phase fuzzy adaptive resonance theory neural network (Fuzzy-ART) for image content classification to overcome the gap between low-level features and high-level semantic concepts. In general, Fuzzy-ART is an unsupervised clustering method. Meanwhile, the training patterns in image content analysis are labelled with their corresponding categories. This category information is useful for supervised learning; thus, a supervised learning mechanism is added to label the categories of the cluster centres derived by Fuzzy-ART.

2.1 SUMMARY OF LITERATURE SURVEY

Table 2.1 List of methodologies for image processing

SNO | AUTHOR AND PUBLICATION | METHODS USED | MERITS | DEMERITS
1 | Dr. H. B. Kekre, Sudeep D. Thepade, Shrikant P. Sanas, "Improved CBIR using Multileveled Block Truncation Coding", (IJCSE) International Journal on Computer Science and Engineering, Vol. 02, No. 07, 2010, 2471-2476 | Block Truncation Coding | Uses three levels of block truncation | Performance is dependent on precision and recall
2 | Lixin Duan, Wen Li, Ivor Wai-Hung Tsang and Dong Xu, "Improving Web Image Search by Bag-Based Reranking", IEEE Transactions on Image Processing, Vol. 20, No. 11, November 2011 | Generalized Multi-Instance learning | Usage of automatic bag annotation | Need for more effective clustering methods
3 | H. B. Kekre, Dhirendra Mishra, "DCT-DST Plane Sectorization of Row-wise Transformed Color Images in CBIR", International Journal of Engineering Science and Technology, Vol. 2 (12), 2010, 7234-7244 | Euclidean distance | DCT transformation of the image gives better results | Computational complexity
4 | Pierre Blanchart, Marin Ferecatu and Mihai Datcu, "Active Learning Using the Data Distribution for Interactive Image Classification and Retrieval", IEEE, 2011 | C-SVM algorithm | Exploitation of the information contained in the unlabelled data | Generalization capability needs to be verified
5 | Yijuan Lu, Lei Zhang, Jiemin Liu, and Qi Tian, "Constructing Concept Lexica with Small Semantic Gaps", IEEE Transactions on Multimedia, Vol. 12, No. 4, June 2010 | LCSS, Nearest Neighbour | Lowers the semantic gap; identifies the concepts that minimize the gap | Does not clearly state how many semantic concepts are necessary
6 | Chuan-Yu Chang, Hung-Jen Wang, Ru-Hao Jian, "Color-Based Semantic Image Retrieval with Fuzzy-ART", Sixth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2010 | Fuzzy-ART | Classifications are closer to human visual perception | Learning is based on supervised machine learning

CHAPTER 3 SYSTEM ARCHITECTURE

PHASE 1

Fig 3.1 Content identification

Fig 3.1 describes the sequence of processes involved in the content identification of the images. Here the content is first obtained, then segmented and processed. The processed details are used for content identification.

PHASE 2
Fig 3.2 Content-based image retrieval

[Diagram blocks: Image query / Content query / Colour query; Feature extraction; Query conversion; Query processing; Business logic layer; Data access layer; Machine learning; Pattern matching; Retrieve matching images; Image data set; Presentation layer]

Fig 3.2 describes the three-tier architecture for content-based image retrieval. The data access layer is the only layer which can communicate with the database and access data. Above it lies the business logic layer, which handles the business validations and checks for errors in the data submitted by the user. The application layer is in direct contact with the user: this is where the user posts queries and gets back results. It is the user interface of the system and also provides some validation at the user-interface level.

CHAPTER 4 MODULE DESCRIPTION

4.1 Query Conversion

The query content is first determined. If the query is image content, then the features of the image are extracted and passed on to the next layer. The query can also be made in the form of a colour or a content name; in that case the query is formed based on the colour or the content given as the query. The formed query is passed to the next layer for processing.

Input: User query
Output: Converted query

Pseudocode
Step 1: Get the query from the user
Step 2: Identify the format of the user query
Step 3: Convert the query into the standard format
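The steps above can be sketched as a small dispatcher. The three query forms and the dict-based "standard format" here are hypothetical illustrations, not the system's actual internal representation.

```python
# Minimal sketch of query conversion: detect the query's form
# (image pixels, colour, or content keyword) and wrap it in a
# uniform structure for the next layer.
def convert_query(query):
    if isinstance(query, list):                      # raw image pixels
        return {"type": "image", "payload": query}
    if isinstance(query, str) and query.startswith("#"):
        return {"type": "color", "payload": query}   # e.g. "#ff0000"
    return {"type": "content", "payload": str(query)}

print(convert_query("#ff0000"))  # {'type': 'color', 'payload': '#ff0000'}
print(convert_query("beach"))    # {'type': 'content', 'payload': 'beach'}
```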

4.2 Feature Extraction

If the user query is in the form of an image, then the features of the image are extracted and passed as the query to the later phases.

Pseudocode
Step 1: Get the image
Step 2: Scan the pixels of the query image
Step 3: Extract the low-level features of the image
Step 4: Pass the extracted features as the query
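The four steps can be sketched as follows; per-channel mean colour stands in for the full low-level feature set, which in practice would also cover texture, shape and spatial layout.

```python
# Sketch of the feature-extraction step: scan every pixel of the
# query image and compute a simple low-level feature vector
# (the mean of each colour channel).
def extract_features(image):
    """image: list of (r, g, b) tuples; returns [mean_r, mean_g, mean_b]."""
    n = len(image)
    return [sum(pixel[c] for pixel in image) / n for c in range(3)]

query_image = [(10, 20, 30), (30, 40, 50)]
print(extract_features(query_image))  # [20.0, 30.0, 40.0]
```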

4.3 Query Processing

The query passed from the business logic layer is processed. In this module the query is processed and the parameters needed for image retrieval are extracted from it.

Input: Converted query
Output: Extracted parameters from the query

Pseudocode
Step 1: Get the query
Step 2: Process the query
Step 3: Identify the parameters in the query
Step 4: Extract the parameters needed for image retrieval
Step 5: Pass the parameters as the query

4.4 Pattern Matching

The query obtained from the previous module is used, and the image data set is checked for matching patterns. Matching patterns are retrieved based on a threshold value entered by the user, which gives the user a choice over the threshold value and thus over the retrieved results.

Pseudocode
Step 1: Get the query parameters
Step 2: Check for matching patterns in the image dataset
Step 3: Retrieve the matching patterns from the database
Step 4: Display the results
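The user-controlled threshold in the pattern-matching module can be sketched as a distance cut-off over the feature database. The feature vectors and dataset names below are toy values for illustration only.

```python
# Sketch of threshold-based pattern matching: images whose feature
# distance to the query falls within the user-chosen threshold are
# retrieved; a larger threshold returns more (looser) matches.
def match(query_features, dataset, threshold):
    """dataset: dict name -> feature vector; returns (name, distance) pairs."""
    results = []
    for name, feats in dataset.items():
        d = sum((a - b) ** 2 for a, b in zip(query_features, feats)) ** 0.5
        if d <= threshold:
            results.append((name, d))
    return results

dataset = {"sunset": [0.9, 0.4, 0.1], "forest": [0.1, 0.8, 0.2]}
print(match([0.85, 0.45, 0.1], dataset, threshold=0.2))  # only "sunset" matches
```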

4.5 Presentation Layer

The results retrieved from the pattern-matching module are collected into a list and sorted in order of similarity. They are then displayed to the user along with the matching category and the similarity measure.

Pseudocode
Step 1: Get the matching results
Step 2: Add them to the list buffer
Step 3: Re-order the results based on category and matching parameters
Step 4: Display the result
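The sorting-and-display step can be sketched as below. The (name, category, distance) tuples and the output format are hypothetical; the real presentation layer is a Windows user interface.

```python
# Sketch of the presentation step: sort matches by distance (most
# similar first) and attach the category and similarity measure
# for display.
def present(matches):
    """matches: list of (name, category, distance) tuples."""
    ordered = sorted(matches, key=lambda m: m[2])
    return [f"{name} [{category}] similarity-distance={dist:.2f}"
            for name, category, dist in ordered]

matches = [("img7", "beach", 0.42), ("img2", "beach", 0.11),
           ("img9", "forest", 0.35)]
for line in present(matches):
    print(line)  # img2 first, then img9, then img7
```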

CHAPTER 5 IMPLEMENTATION

5.1 HARDWARE REQUIREMENTS

Processor: Pentium IV or above
RAM: 128 MB or above
Display: 640 x 480 (SVGA)
Multimedia support, keyboard, mouse

5.2 SOFTWARE REQUIREMENTS

Platform: Windows 7
Language: Microsoft Visual Studio 2010

TOOL DESCRIPTION

Microsoft .NET is a software development component which provides tools and libraries that help software developers create Windows-based applications more easily, faster, and more robustly. Often referred to as the .NET Framework, this tool is used primarily for web applications. .NET technology provides the ability to quickly build, deploy, manage, and use connected, security-enhanced solutions with Web services. .NET-connected solutions enable businesses to integrate their systems more rapidly and in a more agile manner, and help them realize the promise of information anytime, anywhere, on any device. .NET addresses the scripting limitations of COM and DCOM and makes component development an easy task. Two major benefits of .NET are:
- Side-by-side execution of code
- Decentralized registration of components

Because different applications can use different versions of a shared component, .NET developers are not required to maintain backward compatibility. .NET provides developers with an integrated set of tools for building Web services quickly and cost-effectively. Developers can use these tools to create scalable solutions that work across different computing devices. .NET also reduces the problems that occur because of centralized registration of components in the Registry: it does not use the Registry for component registration. Instead, it stores information about the components with the code and retrieves this information directly from the files at runtime. .NET provides an integrated, mobile computing experience to individual users. Data can be integrated from a range of computing hardware, such as laptops, Pocket PCs, smartphones, and other devices, enabling users to access information easily regardless of their location. .NET applications are not dependent on the Registry; therefore it is easy to remove or replicate them. To remove or replicate the applications, users simply need to delete the files or copy over them. Some significant features of .NET technologies are:
- The main advantage of using .NET technology is the reduced amount of code necessary for building applications.
- Better performance is provided through just-in-time compilation, early binding, caching services and native optimization.
- Web pages created with .NET technology perform common tasks such as form submission and client authorization much more easily.
- .NET technology is language independent, which lets users choose the language that best suits their applications.
- A web application created with .NET technology is considered reliable, as the web server controls the pages on an ongoing basis. If it detects infinite looping, memory leakage or any other kind of abnormal or illegal activity, the server at once destroys the offending activity and restarts itself.
- .NET provides the core technologies for developing Web services.

REFERENCES

[1] H. B. Kekre and Sudeep D. Thepade, "Improved CBIR using Multileveled Block Truncation Coding", (IJCSE) International Journal on Computer Science and Engineering, Vol. 02, No. 07, 2010, pp. 2471-2476.
[2] Lixin Duan, Wen Li, Ivor Wai-Hung Tsang, and Dong Xu, "Improving Web Image Search by Bag-Based Reranking", IEEE Transactions on Image Processing, Vol. 20, No. 11, November 2011.
[3] H. B. Kekre and Dhirendra Mishra, "DCT-DST Plane Sectorization of Row-wise Transformed Color Images in CBIR", International Journal of Engineering Science and Technology, Vol. 2 (12), 2010, pp. 7234-7244.
[4] Pierre Blanchart, Marin Ferecatu, and Mihai Datcu, "Active Learning Using the Data Distribution for Interactive Image Classification and Retrieval", IEEE, 978-1-4244-9927-4/11, 2011.
[5] Yijuan Lu, Lei Zhang, Jiemin Liu, and Qi Tian, "Constructing Concept Lexica with Small Semantic Gaps", IEEE Transactions on Multimedia, Vol. 12, No. 4, pp. 18-41, June 2010.
[6] Chuan-Yu Chang, Hung-Jen Wang, and Ru-Hao Jian, "Color-Based Semantic Image Retrieval with Fuzzy-ART", Sixth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, pp. 217-242, 2010.
