Professional Documents
Culture Documents
ABSTRACT
PROBLEM STATEMENT
SOLUTIONS EXIST
SOLUTION PROPOSED
GOAL OR OBJECTIVE
A four staged concept-based mining model should be
developed.
This research should target to prove that proposed
concept based mining model will improve the text
clustering quality.
By exploiting the semantic structure of the sentences
in documents, a better text clustering result should
be achieved.
APPROACH
1. Design and implement new Concept term frequency
measurement model.
2. Sentence-based concept analysis should be done
that helps to analyze the semantic structure of each
sentence.
3. This semantic structure of the sentence captured in
step 2 will be associated with conceptual term
frequency measure to capture the sentence concept.
4. Perform document-based concept analysis that
analyzes each concept at the document level using
the concept-based term frequency.
5. Perform analysis of concepts on the corpus level
using the document frequency global measure
6. Perform concept-based similarity measure that will
allow measuring the importance of each concept with
respect to the semantics of the sentence, the topic of
the document, and the discrimination among
documents in a corpus
Hardware Requirements
A CPU with CORE2duo, 2GB RAM and 80GB HDD
OS: Any OS
Software Requirement
OS should have JRE (Java Runtime Environment)
Language: JAVA SE
IDE: Net beans 6.5
Build Tool: APACHE ANT
Data Mining Tools: WEKA, RAPID MINER