Or
Explain the different data repositories on which data mining tasks
can be performed.
● Some database applications require efficient data structures and scalable methods
for handling complex object structures; variable-length records; semi-structured or
unstructured data; text, spatiotemporal, and multimedia data; and database schemas
with complex structures and dynamic changes.
● For this reason, advanced database systems and application-oriented database
systems have been developed.
Data Mining
A regular data retrieval system is not able to answer queries such as: which items
sold well together?
● Market basket analysis enables you to bundle groups of items together as a
strategy for maximizing sales.
● Printers are commonly purchased together with computers; you could offer an
expensive model of printer at a discount to customers buying selected computers, in
the hope of selling more of the expensive printers.
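The "which items sold well together?" question can be illustrated with a small co-occurrence count. The sketch below is only a minimal illustration, not a full association-rule miner: the transactions are made-up sample data and the helper name `pair_support` is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each set is one customer's basket.
transactions = [
    {"computer", "printer", "paper"},
    {"computer", "printer"},
    {"computer", "mouse"},
    {"printer", "paper"},
    {"computer", "printer", "mouse"},
]

def pair_support(baskets):
    """Return, for each unordered item pair, the fraction of baskets containing it."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

support = pair_support(transactions)
best_pair = max(support, key=support.get)
print(best_pair, support[best_pair])
```

In this toy data the most frequent pair is the computer and printer combination, which is the kind of result that would motivate the bundling strategy described above.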
● A spatial database that stores spatial objects that change with time is called a
spatiotemporal database, from which interesting information can be mined.
● For example, it may be possible to group the trends of moving objects and identify
some strangely moving vehicles, or to distinguish a bioterrorist attack from a normal
outbreak of the flu based on the geographic spread of a disease over time.
● Text databases are databases that contain word descriptions of objects. These
descriptions are usually not simple keywords but long sentences or paragraphs, such
as product specifications, error or bug reports, warning messages, and summary
reports or notes.
● What can data mining on text databases uncover?
● It can discover general and concise descriptions of the text documents, keyword or
content associations, and the clustering behavior of text documents.
● To do this, standard data mining methods need to be integrated with information
retrieval techniques.
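One common information retrieval building block is to turn each document into a term-frequency vector and compare vectors with cosine similarity; similar documents can then be clustered. The sketch below assumes this bag-of-words approach purely for illustration (the notes do not prescribe a specific method), and the sample documents and function names are hypothetical.

```python
import math
import re
from collections import Counter

def term_vector(text):
    """Bag-of-words term frequencies for one document (lowercased, letters only)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical text database: two bug reports and one product specification.
docs = {
    "bug_1": "printer driver crashes when printing large files",
    "bug_2": "printer driver crashes on startup",
    "spec_1": "laptop with fast processor and large memory",
}
vectors = {name: term_vector(text) for name, text in docs.items()}

sim_bugs = cosine_similarity(vectors["bug_1"], vectors["bug_2"])
sim_mixed = cosine_similarity(vectors["bug_1"], vectors["spec_1"])
print(round(sim_bugs, 3), round(sim_mixed, 3))
```

The two bug reports score higher against each other than against the product specification, so a clustering step built on this measure would tend to group them together.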
Multimedia database
Data streams
● Many applications involve the generation and analysis of a new kind of data, where
data flow in and out of an observation platform dynamically.
● Features of data streams: huge or possibly infinite volume, dynamically changing,
flowing in and out in a fixed order, allowing only one or a small number of scans, and
demanding fast response time.
● Data streams include various kinds of scientific and engineering data, time-series
data, power supply, network traffic, stock exchange, telecommunications, video
surveillance, and weather or environmental monitoring data.
● Data streams are normally not stored in any kind of data repository, so efficient
management and analysis of stream data is a challenging task.
● The model used for stream data is the continuous query model, in which predefined
queries constantly evaluate incoming streams, collect aggregate data, report the
current status of the data streams, and respond to their changes.
● Mining data streams involves the efficient discovery of general patterns and dynamic
changes within stream data.
● Example: Intrusions into a computer network may be detected based on anomalies in
message flow, which may be discovered by clustering data streams, dynamically
constructing stream models, or comparing the current frequent patterns with those
at a certain previous time.
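The idea of a predefined query that evaluates a stream in a single pass can be sketched with a fixed-size sliding window. This is only a minimal sketch under stated assumptions: the function name `monitor`, the window size, the threshold rule (a reading more than three times the recent mean), and the message counts are all hypothetical and chosen for illustration.

```python
from collections import deque

def monitor(stream, window_size=5, threshold=3.0):
    """One-pass continuous query over a stream of numeric readings.

    Yields (index, value) whenever a reading exceeds `threshold` times the
    mean of the previous `window_size` readings. Only the window is kept in
    memory, so the stream itself is never stored.
    """
    window = deque(maxlen=window_size)
    running_sum = 0.0
    for index, value in enumerate(stream):
        if len(window) == window_size:
            mean = running_sum / window_size
            if mean > 0 and value > threshold * mean:
                yield index, value
            # The oldest reading is about to be evicted by append().
            running_sum -= window[0]
        window.append(value)
        running_sum += value

# Hypothetical messages-per-second counts with one burst.
message_counts = [10, 12, 11, 9, 10, 11, 95, 10, 12]
alerts = list(monitor(message_counts))
print(alerts)
```

Note that in this simple version the anomalous reading itself enters the window and inflates the mean for the next few readings; a more careful monitor might exclude flagged values from the aggregate.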
● Data mining on the web can often provide more help than web search services alone.
● Web page analysis based on linkages among pages can help rank web pages based
on their importance, influence, and topics.
● Automated web page clustering and classification help group and arrange web pages
in a multidimensional manner based on their content.
● Web community analysis helps identify hidden web social networks and communities
and observe their evolution.
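Ranking pages by their linkages can be illustrated with a PageRank-style power iteration. The notes do not name a specific algorithm, so PageRank is used here only as one well-known example of link-based ranking; the link graph, damping factor, and iteration count below are assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """PageRank-style scores for a dict mapping each page to its outgoing links.

    Every link target must also appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical four-page site.
links = {
    "home": ["products", "about"],
    "products": ["home"],
    "about": ["home"],
    "blog": ["home", "products"],
}
ranks = pagerank(links)
top_page = max(ranks, key=ranks.get)
print(top_page, {page: round(score, 3) for page, score in ranks.items()})
```

Here the page that receives links from every other page ends up with the highest score, while the page with no incoming links keeps only the baseline share.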
● A temporal database stores relational data that include time-related attributes. These
attributes may involve several timestamps, each having different semantics.
● Data mining techniques can be used to find the characteristics of object evolution or
the trend of changes for objects in the database. Such information can be useful in
decision making and strategy planning.
● Example: Mining bank data may aid in scheduling bank tellers according to the
volume of customer traffic.
● A sequence database stores sequences of ordered events, with or without a concrete
notion of time (for example, customer shopping sequences).
● A time-series database stores sequences of values or events obtained over repeated
measurements of time (for example, hourly, daily, or weekly).
● Example: data collected from the stock exchange, inventory control, and observation
of natural phenomena (like temperature and wind).
● Stock exchange data can be mined to discover trends that could help you plan
investment strategies (for example, when is the best time to purchase AllElectronics
stock?). Such analysis requires defining multiple granularities of time.
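Working at multiple granularities of time usually means rolling the same observations up to different levels, such as day and week. The sketch below shows one way this might look; the prices and timestamps are invented sample data, and the helper name `roll_up` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical (timestamp, price) observations for one stock.
observations = [
    ("2024-01-01 09:00", 100.0),
    ("2024-01-01 15:00", 104.0),
    ("2024-01-02 09:00", 98.0),
    ("2024-01-02 15:00", 96.0),
    ("2024-01-08 09:00", 110.0),
    ("2024-01-08 15:00", 112.0),
]

def roll_up(records, key_fn):
    """Average the values of all records that share the same time-granularity key."""
    groups = defaultdict(list)
    for stamp, value in records:
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        groups[key_fn(moment)].append(value)
    return {key: mean(values) for key, values in groups.items()}

# Daily granularity: key is the calendar date.
daily = roll_up(observations, lambda t: t.date().isoformat())
# Weekly granularity: key is (ISO year, ISO week number).
weekly = roll_up(observations, lambda t: tuple(t.isocalendar())[:2])

cheapest_day = min(daily, key=daily.get)
print(daily)
print(weekly)
print(cheapest_day)
```

The same records give a day-level answer and a week-level answer, so a question like "when is the best time to buy?" depends on which granularity the analyst chooses.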