
www.andhracolleges.com The Complete Information About Colleges in Andhra Pradesh

Code No: M0502 Set No. 1


IV B.Tech I Semester Regular Examinations, November 2009
DATA WAREHOUSING AND DATA MINING
(Computer Science & Engineering)
Time: 3 hours Max Marks: 80

Answer any FIVE Questions
All Questions carry equal marks
*****

1. (a) Explain data mining as a step in the process of knowledge discovery.


(b) Differentiate operational database systems and data warehousing. [8+8]

2. (a) Briefly discuss the forms of data preprocessing with a neat diagram.
(b) Briefly discuss the parametric and non-parametric methods of numerosity
reduction. [8+8]

3. Explain the syntax for the following data mining primitives:

(a) Task-relevant data


(b) The kind of knowledge to be mined
(c) Interestingness measures
(d) Presentation and visualization of discovered patterns. [16]

4. (a) How can we perform attribute relevance analysis for concept description?
Explain.
(b) Briefly explain the presentation of class comparison descriptions. [8+8]

5. Compare and contrast mining single-dimensional Boolean association rules and
multilevel association rules for transactional databases. [16]

6. (a) Why is naive Bayesian classification called “naive”? Briefly outline the major
ideas of naive Bayesian classification.
(b) Define regression. Briefly explain linear, non-linear and multiple
regression. [8+8]

7. (a) Use a diagram to illustrate how, for a constant MinPts value, density-based
clusters with respect to a higher density (i.e., a lower value for ε, the
neighborhood radius) are completely contained in density-connected sets obtained
with respect to a lower density.
(b) Give an example of how specific clustering methods may be integrated, for
example, where one clustering algorithm is used as a preprocessing step for
another. [8+8]

8. (a) Explain similarity search in multimedia data.


(b) Explain similarity search in time-series analysis.

(c) What is meant by authoritative Web pages? Explain mining the Web’s
link structures to identify authoritative Web pages. [5+6+5]

*****


Code No: M0502 Set No. 2


IV B.Tech I Semester Regular Examinations, November 2009
DATA WAREHOUSING AND DATA MINING
(Computer Science & Engineering)
Time: 3 hours Max Marks: 80

Answer any FIVE Questions
All Questions carry equal marks
*****

1. (a) Explain the efficient computation of data cubes.


(b) Discuss the efficient processing of OLAP queries. [8+8]

2. Briefly discuss the Discretization and concept hierarchy techniques. [16]

3. Explain the syntax for the following data mining primitives:

(a) Task-relevant data


(b) The kind of knowledge to be mined
(c) Interestingness measures
(d) Presentation and visualization of discovered patterns. [16]

4. (a) Differentiate attribute generalization threshold control and generalized
relation threshold control.
(b) Differentiate between predictive and descriptive data mining. [8+8]

5. Propose a method for mining hybrid-dimension association rules (multidimensional
association rules with repeating predicates) and explain with an example. [16]

6. (a) Why is naive Bayesian classification called “naive”? Briefly outline the major
ideas of naive Bayesian classification.
(b) Define regression. Briefly explain linear, non-linear and multiple
regression. [8+8]

7. The following table contains the attributes name, gender, trait-1, trait-2, trait-3,
and trait-4, where name is an object-id, gender is a symmetric attribute, and the
remaining trait attributes are asymmetric, describing personal traits of individuals
who desire a penpal. Suppose that a service exists that attempts to find pairs of
compatible penpals.

Name      gender  trait-1  trait-2  trait-3  trait-4
Kevan     M       N        P        P        N
Caroline  F       N        P        P        N
Erik      M       P        N        N        P
...       ...     ...      ...      ...      ...

For asymmetric attribute values, let the value P be set to 1 and the value N be set
to 0. Suppose that the distance between objects (potential penpals) is computed
based only on the asymmetric variables.

(a) Show the contingency matrix for each pair given Kevan, Caroline, and Erik.
(b) Compute the simple matching coefficient for each pair.
(c) Compute the Jaccard coefficient for each pair.
(d) Who do you suggest would make the best pair of penpals? Which pair of
individuals would be the least compatible? [4+4+4+4]
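
The computation asked for in parts (a) to (c) can be sketched in a few lines of Python. This is only an illustrative sketch, not a model answer: the function names are hypothetical, the trait values are copied from the table with P = 1 and N = 0, and the coefficients are written as similarities, so the corresponding distance is 1 minus the coefficient. Returning 0.0 when a pair shares no positive traits and has no mismatches is an assumed convention.

```python
from itertools import combinations

# Asymmetric trait values from the table (P -> 1, N -> 0); gender is ignored.
traits = {
    "Kevan":    [0, 1, 1, 0],
    "Caroline": [0, 1, 1, 0],
    "Erik":     [1, 0, 0, 1],
}

def contingency(x, y):
    """Return (q, r, s, t): counts of 1-1, 1-0, 0-1 and 0-0 positions."""
    q = sum(a == 1 and b == 1 for a, b in zip(x, y))
    r = sum(a == 1 and b == 0 for a, b in zip(x, y))
    s = sum(a == 0 and b == 1 for a, b in zip(x, y))
    t = sum(a == 0 and b == 0 for a, b in zip(x, y))
    return q, r, s, t

def simple_matching(x, y):
    """Fraction of positions on which the two objects agree."""
    q, r, s, t = contingency(x, y)
    return (q + t) / (q + r + s + t)

def jaccard(x, y):
    """Like simple matching, but 0-0 agreements are not counted."""
    q, r, s, _ = contingency(x, y)
    denom = q + r + s
    return q / denom if denom else 0.0  # assumed convention for an empty denominator

for a, b in combinations(traits, 2):
    print(a, b, contingency(traits[a], traits[b]),
          simple_matching(traits[a], traits[b]),
          jaccard(traits[a], traits[b]))
```

The pair with the highest coefficient is the most similar on the asymmetric traits, and the pair with the lowest is the least similar.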

8. (a) What is a multimedia database? Explain mining multimedia databases.


(b) What is a time-series database? What is a sequence database? Explain mining
time-series and sequence data. [8+8]

*****


Code No: M0502 Set No. 3


IV B.Tech I Semester Regular Examinations, November 2009
DATA WAREHOUSING AND DATA MINING
(Computer Science & Engineering)
Time: 3 hours Max Marks: 80

Answer any FIVE Questions
All Questions carry equal marks
*****

1. (a) Explain data mining as a step in the process of knowledge discovery.


(b) Differentiate operational database systems and data warehousing. [8+8]

2. Explain various data reduction techniques. [16]

3. The four major types of concept hierarchies are: schema hierarchies, set-grouping
hierarchies, operation-derived hierarchies, and rule-based hierarchies.

(a) Briefly define each type of hierarchy.


(b) For each hierarchy type, provide an example. [16]

4. (a) Differentiate attribute generalization threshold control and generalized
relation threshold control.
(b) Differentiate between predictive and descriptive data mining. [8+8]

5. (a) Explain about constraint-based Association mining.


(b) Give an example of association rule mining. Classify association rules. [8+8]

6. (a) Given a decision tree, you have the option of (i) converting the decision tree
to rules and then pruning the resulting rules, or (ii) pruning the decision tree
and then converting the pruned tree to rules. What advantages does the former
option have over the latter? Explain.
(b) Can any ideas from association rule mining be applied to classification?
Explain. [8+8]

7. Explain the following:

(a) DBSCAN
(b) OPTICS
(c) DENCLUE
(d) BIRCH. [4+4+4+4]

8. (a) What is a spatial data warehouse? What are the different types of dimensions
in a spatial data cube? What are the different types of measures in a spatial
data cube?
(b) What is keyword-based association analysis? How can automated document
classification be performed?
(c) Briefly discuss mining the World Wide Web. [2+2+2+2+2+6]

*****


Code No: M0502 Set No. 4


IV B.Tech I Semester Regular Examinations, November 2009
DATA WAREHOUSING AND DATA MINING
(Computer Science & Engineering)
Time: 3 hours Max Marks: 80

Answer any FIVE Questions
All Questions carry equal marks
*****

1. (a) Describe three challenges to data mining regarding data mining methodology
and user interaction issues.
(b) Explain Indexing OLAP data. [8+8]

2. Explain various data reduction techniques. [16]

3. (a) Discuss the various forms of visualizing the discovered patterns.


(b) Discuss the task-relevant data specification. [8+8]

4. Suppose that the data for analysis include the attribute age. The age values for
the data tuples are (in increasing order):
13,15,16,16,19,20,20,21,22,22,25,25,25,25,30,33,33,35,35,35,35,36,40,45,46,52,70.

(a) What is the mean of the data?
(b) What is the median?
(c) What is the mode of the data? Comment on the data’s modality.
(d) What is the midrange of the data?
(e) Can you find (roughly) the first quartile (Q1) and third quartile (Q3) of the
data?
(f) Give the five-number summary of the data.
(g) Show a box plot of the data.
(h) How is the quantile-quantile plot different from a quantile plot? [16]
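
Parts (a) to (f) can be checked with Python's standard `statistics` module. The sketch below is illustrative only, and the variable names are hypothetical. Quartile conventions differ between textbooks and tools, so the values from `statistics.quantiles` (default "exclusive" method) are one reasonable choice rather than the only acceptable answer.

```python
import statistics

# Age values exactly as listed in the question (27 tuples, already sorted).
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

mean = statistics.mean(ages)
median = statistics.median(ages)
modes = statistics.multimode(ages)      # every value tied for highest frequency
midrange = (min(ages) + max(ages)) / 2  # average of the smallest and largest value

# Quartiles under the default "exclusive" method; other conventions
# (for example method="inclusive") can give slightly different cut points.
q1, q2, q3 = statistics.quantiles(ages, n=4)

# Five-number summary: minimum, Q1, median, Q3, maximum.
five_number = (min(ages), q1, median, q3, max(ages))

print(f"mean={mean:.2f} median={median} modes={modes} midrange={midrange}")
print("five-number summary:", five_number)
```

The five-number summary supplies the values a box plot is drawn from: the box spans Q1 to Q3 with a line at the median, and the whiskers extend toward the minimum and maximum.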

5. Sequential patterns can be mined using methods similar to those for mining association
rules. Design an efficient algorithm to mine multilevel sequential patterns from
a transaction database. An example of such a pattern is the following: “A customer
who buys a PC will buy Microsoft software within three months”, on which one
may drill down to find a more refined version of the pattern, such as “A customer
who buys a Pentium PC will buy Microsoft Office within three months”. [16]

6. (a) What is classification? What is prediction? Describe issues regarding
classification and prediction.
(b) Explain Bayesian belief networks. How does a Bayesian belief network train?
[8+8]

7. (a) Write algorithms for k-Means and k-Medoids. Explain.


(b) Discuss density-based methods. [8+8]

8. (a) Explain the classification and prediction analysis of multimedia data.


(b) What are the basic measures for text retrieval? What methods are there for
information retrieval?
(c) What is meant by ‘authoritative’ Web pages? Explain mining the Web’s
link structures to identify authoritative Web pages. [4+6+6]

*****
