Abstract

Extraction of knowledge from data and using it for decision making is vital in various real-world problems, particularly in the financial domain. We identify several financial problems, which require the mining of actionable subspaces defined by objects and attributes over a sequence of time. These subspaces are actionable in the sense that they have the ability to suggest profitable action for the decision-makers. We propose to mine actionable subspace clusters from sequential data, which are subspaces with high and correlated utilities. To efficiently mine them, we propose a framework MASC (Mining Actionable Subspace Clusters), which is a hybrid of numerical optimization, principal component analysis and frequent itemset mining. We conduct a wide range of experiments to demonstrate the actionability of the clusters and the robustness of our framework MASC. We show that our clustering results are not sensitive to the framework parameters and full recovery of embedded clusters in synthetic data is possible. In our case-study, we show that clusters with higher utilities correspond to higher actionability, and we are able to use our clusters to perform better than one of the most famous value investment strategies.

Figure 1: An example of an actionable subspace cluster. (a) Objects o2, o3, o4 and attributes a2, a3, a4 form a 3D subspace cluster. (b) Objects o2, o3, o4 have high utility and they are correlated across time.
1 Introduction
Clustering aims to find groups of similar objects and, due to its usefulness, it is popular in a large variety of domains, such as astronomy, physics, geology, marketing, etc. Over the years, data gathering has become more effective and easier, resulting in many of these domains having high-dimensional databases. As a consequence, the distance (difference) between any two objects becomes similar in high-dimensional data, thus diluting the meaning of a cluster [5]. One way to handle this issue is by clustering in subspaces of the dimension space, so that objects in a group need only be similar on some subset of attributes (subspace), instead of being similar across the entire set of attributes (full-space) [20].

Besides being high-dimensional, the databases in these domains also potentially change over time. In such sequential databases, finding subspace clusters per timestamp may produce a lot of spurious and arbitrary clusters, hence it is desirable to find clusters that persist in the database over some given period.

Moreover, the usefulness of these clusters, and in general of any mined patterns, lies in their ability to suggest concrete and useful actions. Such patterns are called actionable patterns [19], and they are normally associated with the amount of profit that their suggested actions bring [19, 30, 31].

In this paper, we identify real-world problems, particularly in the financial world, which motivate the need to infuse subspace clustering with actionability, for example:

Example 1 Value investors scrutinize fundamentals or financial ratios of companies, in the belief that they are crucial indicators of their future stock price movements [7, 13, 14]. Although experts like Benjamin Graham [13] have suggested

∗ Institute for Infocomm Research, A*STAR, Singapore. Email: shsim@i2r.a-star.edu.sg
† School of Computer Engineering, Nanyang Technological University, Singapore. Email: ardi0002@ntu.edu.sg
‡ School of Computer Engineering, Nanyang Technological University, Singapore. Email: asvivek@ntu.edu.sg
where Neigh_a(o) is the set of k-nearest neighbors of object o on attribute a. The k-nearest neighbors are obtained using Equation 3.2.

The k-nearest neighbors heuristic adapts the width of the Gaussian function according to the distribution of the objects projected in the data space of attribute a, thus it is more robust than setting σ to a constant value. In our experiments, we set the default value of k as 10, and show that this default setting works well in practice. Moreover, the results are not sensitive to various values of k.

Next, we define the quality of a cluster. Let u^t(o) be the utility of object o at time t. We assume that the utility of an object measures the quality of the object; the higher the utility, the higher the quality of the object. The utility of object o over time is denoted as util(o). In this paper, we simply use util(o) as the average utility, given as:

(3.4)  util(o) = \frac{1}{|T|} \sum_{t \in T} u^t(o).

Note that our framework can also be adapted to other utility functions, such as the compound annual growth rate (CAGR).

We also require all objects in a cluster to behave similarly across time. The correlation between two objects is measured using the statistical correlation measure, which is the proportion between their covariance and their individual standard deviations. The standard deviation of the utility of object o is calculated as:

(3.5)  σ(o) = \sqrt{\frac{1}{|T|} \sum_{t \in T} \big(u^t(o) - \bar{u}(o)\big)^2}

and the covariance between two objects o1, o2 is calculated as:

(3.6)  cov(o_1, o_2) = \frac{1}{|T|} \sum_{t \in T} \big(u^t(o_1) - \bar{u}(o_1)\big)\big(u^t(o_2) - \bar{u}(o_2)\big).

We do not set thresholds to explicitly define the goodness of the objects' similarity and correlation. Instead, our framework forms an objective function to measure their goodness, and finds clusters that maximize this objective function. As for the goodness of high utility, we shall explain it in detail in Section 4.1.

To remove redundancies in the clusters, we only mine maximal actionable subspace clusters. An actionable cluster (O × A) is maximal if there is no other actionable subspace cluster (O′ × A′) such that O ⊆ O′ and A ⊆ A′. As we always mine clusters that are maximal, for brevity, we simply denote them as actionable subspace clusters.

Having all the aforementioned notation, we can formally define the actionable subspace cluster mining problem as:

DEFINITION 3.2. [ACTIONABLE SUBSPACE CLUSTERS MINING PROBLEM]. Given a database D, we find all actionable subspace clusters (O × A).

4 MASC Algorithm
4.1 Framework Our framework is illustrated in Figure 2, which consists of two modules.

1. Projection into a Standard Relational Database. The actionable and sequential database is projected into a standard relational database, based on a chosen cluster center c. Note that the projection is per centroid, i.e., we will have one relational database for each cluster center. In our experiments, we choose the centroids to be objects with utility higher than a parameter util_min. In practice, the domain expert might also select the centroids based on their internal knowledge.

The projection is done by setting up an objective function that incorporates the utility, similarity, and correlation of objects. We show that this function can be modeled as an augmented Lagrangian equation,
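For concreteness, the per-object statistics of Equations (3.4)–(3.6) can be computed as in the short sketch below. This is an illustrative sketch rather than our MASC implementation: it assumes that the utilities u^t(o) of an object are stored as a plain vector indexed by time, and that the correlation ρ(o1, o2) is the usual Pearson coefficient cov(o1, o2)/(σ(o1)·σ(o2)), as the description above suggests; the function names are ours.

```cpp
// Illustrative sketch of the statistics in Equations (3.4)-(3.6).
// Each object's utilities u^t(o) are stored as a std::vector<double>,
// one entry per time frame; all vectors are assumed to have the same length |T|.
#include <cmath>
#include <cstdio>
#include <vector>

// Equation (3.4): util(o) = (1/|T|) * sum_t u^t(o)
double util(const std::vector<double>& u) {
    double sum = 0.0;
    for (double v : u) sum += v;
    return sum / u.size();
}

// Equation (3.5): sigma(o) = sqrt((1/|T|) * sum_t (u^t(o) - util(o))^2)
double sigma(const std::vector<double>& u) {
    double mean = util(u), ss = 0.0;
    for (double v : u) ss += (v - mean) * (v - mean);
    return std::sqrt(ss / u.size());
}

// Equation (3.6): cov(o1,o2) = (1/|T|) * sum_t (u^t(o1)-util(o1)) * (u^t(o2)-util(o2))
double cov(const std::vector<double>& u1, const std::vector<double>& u2) {
    double m1 = util(u1), m2 = util(u2), s = 0.0;
    for (std::size_t t = 0; t < u1.size(); ++t) s += (u1[t] - m1) * (u2[t] - m2);
    return s / u1.size();
}

// Correlation rho(o1,o2): covariance normalized by the two standard deviations.
double rho(const std::vector<double>& u1, const std::vector<double>& u2) {
    return cov(u1, u2) / (sigma(u1) * sigma(u2));
}

int main() {
    std::vector<double> o1 = {0.10, 0.15, 0.22, 0.07, 0.05};
    std::vector<double> o2 = {0.09, 0.12, 0.25, 0.05, 0.03};
    std::printf("util(o1)=%.3f sigma(o1)=%.3f rho(o1,o2)=%.3f\n",
                util(o1), sigma(o1), rho(o1, o2));
    return 0;
}
```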
5 Experimentation Results
We evaluated three main aspects of our approach using synthetic datasets: (1) cluster quality (including a comparison with TRICLUSTER [34], LCM-nCluster [23] and STATPC [24]), (2) parameter sensitivity, and (3) efficiency and scalability. A synthetic dataset D contains 1000 objects, each with 10 attributes across 10 time frames. The attribute values of the objects range from 0 to 1, and their utilities range from -1 to 1. In each dataset, we embedded a set of 10 random subspace clusters, each with 3-10 objects being similar in 2-9 attributes. By default, we set the utility of each object (util(o)) and the correlation between each pair of objects o1 and o2 in each embedded cluster (ρ(o1, o2)) to be at least 0.5. To ensure the objects within a cluster are homogeneous, we also set the maximum difference between objects' values on every attribute of the subspace, denoted as diff, to 0.1. These values hold for all experiments, unless explicitly changed.

We also performed an extensive case study on real stock market data to show the actionability of the resultant clusters, and compare them with the criteria provided by Graham. Our case study also shows the practical usage of cluster definitions for value investors.

All approaches were coded in C++, and code for competing approaches was kindly provided by their respective authors. The experiments were performed in a Windows Vista environment, using an Intel Core 2 Duo 2.6 GHz CPU with 4GB RAM, except those involving algorithm TRICLUSTER, which were performed on a server with Linux.¹

• … max{|O∗ ∩ O| + |A∗ ∩ A| : C∗ = O∗ × A∗, C = O × A, C ∈ C}. recoverability measures the ability of C to recover C∗.

• spuriousness S = Σ_{C∈C} s(C) / (|O| + |A|), where s(C) = |O| + |A| − max{|O∗ ∩ O| + |A∗ ∩ A| : C∗ = O∗ × A∗, C = O × A, C∗ ∈ C∗}. spuriousness measures how spurious C is.

• significance = 2R(1−S) / (R + (1−S)). significance is a measure to find the best trade-off between the recoverability and spuriousness of C. The higher the significance, the more similar is C to the embedded clusters C∗.

The other approaches do not consider utility in their clustering, so to have a fair comparison, we allowed the other approaches to use a pre-processed D that only contains objects whose utility and correlation measures are at least those of the embedded clusters. For algorithm LCM-nCluster, we varied its parameter setting δ from 0.1 to 0.9; δ controls the differences allowed in the values of a subspace cluster. One surprising result is that algorithm TRICLUSTER did not produce any clusters in any of our experimental settings. It either could not mine any 3D subspace clusters, or the mining could not be completed within 6 hours, which may be due to scalability issues on the dataset. For algorithm STATPC, we used its recommended statistical significance level settings.

¹ Algorithm TRICLUSTER can only be run in a Linux environment.
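As a small worked example of the trade-off that significance captures (the numbers here are illustrative, not experimental results): a clustering with recoverability R = 0.9 and spuriousness S = 0.2 scores

significance = \frac{2R(1-S)}{R+(1-S)} = \frac{2(0.9)(0.8)}{0.9 + 0.8} \approx 0.85,

while perfect recovery with no spurious objects or attributes (R = 1, S = 0) gives significance = 1.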
Figure 3: The significance of the actionable subspace clusters mined by different algorithms. (a) Significance against the average utilities of clusters. (b) Significance against the average correlation coefficient of clusters. (c) Significance against diff. Each panel compares MASC, LCM-nCluster (δ = 0.1, 0.5, 0.9) and STATPC; in panel (c), MASC is shown for k = 5, 10, 15.
5.1.1 Varying utility util(o) We varied the utility util(o) of the embedded clusters and the results are presented in Figure 3(a). The significance of the clusters mined by MASC is close to 1, particularly when the utility is high (≥ 0.3). The significance drops when the utility decreases, as we found the clusters to be contaminated by randomly generated objects. However, these random objects also have utility and values similar to the embedded clusters, so it makes sense to combine them with the cluster. On the other hand, the significance of the clusters mined by algorithm LCM-nCluster is quite low, and all clusters found by STATPC have zero significance. Hence, we do not present the results of STATPC in Figure 3(a).

5.1.2 Varying correlation coefficients ρ(o1, o2) We varied the correlation coefficients ρ(o1, o2) among objects in each embedded cluster and the results are presented in Figure 3(b). We can see that the significance of clusters mined by algorithm MASC is also much higher than those by the other approaches.

5.1.3 Varying diff We varied diff and the results are presented in Figure 3(c). The significance of clusters mined by MASC is still much higher than those mined by other approaches. When diff = 0, the significance of the clusters mined by LCM-nCluster reaches 0.78, but it drops dramatically as diff increases. This shows that LCM-nCluster is only suitable for mining clusters that contain objects with exact values, which is not common in real-world datasets. On the contrary, MASC can still maintain the quality of the clusters even as diff increases.

5.2 Sensitivity Analysis In this section, we evaluated how sensitive the parameters of algorithm MASC are in mining actionable subspace clusters. Similar to the experimental setup of the previous section, we embedded 10 actionable subspace clusters in a synthetic dataset D for each experiment in this section.

5.2.1 Varying λ, δ We investigated the sensitivity of parameters λ, δ by fixing τ = 1, µ = 10 and varying λ, δ across a wide range. τ and µ are fixed at the values recommended in [26]. Algorithm MASC is able to perfectly mine the embedded clusters in D (significance = 1) across a wide range of parameter settings, as shown in Figure 4(a). The significance of the results drops only when extreme values are used for the parameters, that is, when λ ≥ 10 and δ ≥ 0.1. Thus, algorithm MASC is insensitive to the parameters λ, δ.

5.2.2 Varying τ, µ We then investigated the sensitivity of parameters τ, µ by fixing δ = 0.001, λ = 0.1 and varying τ, µ. The values of δ and λ are chosen as 0.001 and 0.1 respectively, as the region of values around them gives high significance for the results, and this setting also gives a relatively fast running time, which we shall explain in Section 5.3. Algorithm MASC is still able to perfectly mine
Figure 4: The significance of the actionable subspace clusters mined by algorithm MASC across varying parameter settings. (a) Fix τ = 1, µ = 10, vary δ, λ. (b) Fix δ = 0.001, λ = 0.1, vary τ, µ. (c) Varying k.
the embedded clusters in D across a wide range of parameter settings, as shown in Figure 4(b). τ is insensitive to any range of values, and µ ≥ 1 is recommended to get results with high significance. Hence, our observations are in accordance with the authors' recommended settings of τ = 1, µ = 10 [26].

5.2.3 Varying k We checked the sensitivity of parameter k and used the default settings for the other parameters. Figure 4(c) presents the results, which show that for all k ≥ 5, the significance of the results is close to 1. Thus, we demonstrated the robustness of the k-nearest neighbors heuristic.

From these experimental results, we can see that problems in mining the embedded actionable subspace clusters only exist when we use extreme values for these parameters. Therefore, users can opt to use the default settings of these parameters.

5.3 Efficiency Analysis In this section, we evaluated the efficiency of algorithm MASC in mining actionable subspace clusters. We investigated two main areas that affect the running time of algorithm MASC: (a) the size of dataset D and (b) the parameter settings. On the size of D, we argue that only the number of objects needs to be evaluated. The main computation cost of MASC lies in the computation of the optimal probability distribution P, and the number of objects directly affects this computation. The number of time frames does not affect the running time of MASC, as the objects' values and utilities are averaged over time in MASC. As for the number of attributes, it is a constant factor in the running time of computing P, as we compute P for every attribute.

5.3.1 Varying Number of Objects We varied the number of objects from 1,000 to 10,000 in a synthetic dataset with an attribute and a time frame. We then randomly chose 10 centroids from D and averaged the running time of computing P for each centroid. Figure 5(a) presents the average running time used in computing P, which is a polynomial function of the number of objects. Furthermore, it takes less than a minute to compute a P of size 10,000, which is feasible for real-world applications.

5.3.2 Varying δ, λ We investigated how the parameter settings affect the running time of MASC. For this experiment, we used the same experimental setup as Section 5.3.1. Figure 5(b) presents the running time when varying the tolerance level δ. As δ decreases, the convergence of the solution is slower, which leads to the expected increase in running time.

Figure 5(c) presents the running time when varying λ, which shows that the running time increases as λ increases. The running time is fastest when λ = 1, which implies that 1 is a close estimate of the actual λ. At other settings, the running time is slower as MASC has to iterate more to reach the correct λ. We set λ = 1 as the default, since it is also shown in Section 5.2.1 that the significance of the results is high at λ = 1. We do not show the running time of varying τ, µ since their default settings are recommended in [26].

5.4 Case Study: Using Actionable Subspace Clusters in the Stock Market In value investment, investors scrutinize the financial ratios of stocks to find stocks that potentially generate high profits. One of the most famous value investment strategies was formulated by the founder of value investment, Benjamin Graham, and it was proven to be profitable [27]. This strategy consists of a buy phase and a sell phase. The buy phase consists of ten criteria on the financial ratios and price of stocks, with five ‘reward’ criteria and five ‘risk’ criteria. If a stock satisfies at least one of the ‘reward’ criteria and one of the ‘risk’ criteria, we buy this stock. This
Figure 5: The running time of algorithm MASC across varying datasets and parameter settings. (a) Running time (seconds) against the number of objects (×10³). (b) Running time against δ. (c) Running time against λ.
set of ten criteria is presented in [27]. In the sell phase, a stock is sold if its price appreciates by 50% within two years. If not, it is sold after two years.

5.4.1 Comparing Different Value Investment Strategies The set of 10 criteria in the buy phase was formulated based on Graham's 40 years of research in the stock market. In this case-study, we attempt to replace this human perception-based buy phase with an actionable subspace clustering-based buy phase. As described in Example 1, using actionable subspace clusters has a two-fold effect. Firstly, stocks that are clustered have historically high returns (utilities), so they are potentially good buys. Secondly, the subspace clusters of financial ratios can be studied to understand why they lead to stocks with high returns.

In this experiment, we downloaded the financial figures and price data of all North American stocks from year 1980 to 2000. This 20 years of data was downloaded from Compustat [11]. We converted these financial figures into 30 financial ratios, based on the ratio formulas from Investopedia [16] and Graham's ten criteria. We removed penny stocks (whose prices are less than USD$5) from the data, as there is a high risk that these stocks are manipulative and their financial figures are less transparent [29]. Thus, we assume that the investor exercises due caution and only concentrates on big companies' stocks. After pruning the penny stocks, we have 1762 to 2071 stocks for each year.

For Graham's investment strategy, the buy phase was conducted over a ten-year period from year 1991 to 2000. The buy phase starts in 1991 as Graham's ten criteria require the analysis of the past ten-year window of the stocks' historical financial ratios and price data. Figure 6(a) shows the average returns of the stocks bought using Graham's ten criteria, from year 1991 to 2000. The average return across the ten years is 12.84%, which is a substantial amount of profit.

To have a fair comparison, for each buy phase from year 1991 to 2000, we also used the ten-year window of the stocks' historical financial ratios and price data to mine actionable subspace clusters. We take the utility util(o) to be the annual price return of the stock o, and we use the average utility over time util_avg(o) to measure the price return of stocks over time. From the set of actionable subspace clusters mined from each buy phase from year 1991 to 2000, we bought the stocks that are in them. Figure 6(a) shows the average returns of the stocks bought from year 1991 to 2000, where we varied util_min from 0.3 to 1 and used the default settings for the parameters of the algorithm MASC. The results for settings util_min > 0.7 are not shown, as no stocks could be clustered for certain years. The average returns across the ten years are shown in Figure 6(b). We can see that at certain util_min settings, the average returns across the ten years are significantly better than Graham's investment strategy; in particular, when util_min = 0.7, the average return across the ten years is 29.23%. A possible explanation is that Graham's investment strategy was formulated in 1977 and his ten criteria are not adapted to the evolving dynamics of the stock market. On the other hand, actionable subspace clusters are able to capture these evolving dynamics between stock prices and financial ratios to cluster profitable stocks.

5.4.2 Effectiveness of Actionability In this experiment, we investigated whether high utilities do correspond to the concept of actionability. From Figure 6(b), we can see that the average returns across the ten years follow an increasing trend as util_min increases. As mentioned in the previous section, the results for settings util_min > 0.7 are not shown, as no stocks could be clustered for certain years. Recall that util_min is the minimum average return required for a stock to be chosen as a centroid for mining actionable subspace clusters, so the possibility of stocks having higher average returns being clustered is higher. Since Figure 6(b) shows that using actionable subspace clusters with higher utilities generates higher profits, we have shown that higher utility correlates with higher actionability. Figure 6(c) shows that as util_min increases, the average number of stocks in a cluster decreases. For an investor who wants to diversify his
Figure 6: (a) Average returns of stocks from 1991 to 2000 (Graham vs. MASC). (b) Average returns of stocks for different utilities (util_min). (c) Average number of stocks for different utilities (util_min).
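For readers who wish to reproduce the Graham baseline of Section 5.4, its buy and sell rules as described above can be sketched as follows. This is an illustrative sketch only: the ten criteria themselves are listed in [27] and are represented here as opaque predicates, and the type and function names are ours, not taken from our implementation.

```cpp
// Illustrative sketch of the Graham-style buy/sell rule used as a baseline in Section 5.4.
// The five 'reward' and five 'risk' criteria are given in [27]; here they are supplied
// as boolean predicates evaluated on a stock's financial ratios.
#include <cstdio>
#include <functional>
#include <vector>

struct Stock {
    std::vector<double> financialRatios;  // the financial ratios of the stock
    std::vector<double> pricesAfterBuy;   // prices observed over the two years after buying
    double buyPrice;                      // price paid in the buy phase
};

using Criterion = std::function<bool(const Stock&)>;

// Buy rule: the stock must satisfy at least one 'reward' and at least one 'risk' criterion.
bool shouldBuy(const Stock& s,
               const std::vector<Criterion>& reward,
               const std::vector<Criterion>& risk) {
    bool anyReward = false, anyRisk = false;
    for (const auto& c : reward) anyReward = anyReward || c(s);
    for (const auto& c : risk)   anyRisk   = anyRisk   || c(s);
    return anyReward && anyRisk;
}

// Sell rule: sell as soon as the price appreciates by 50% within two years;
// otherwise sell at the end of the two-year holding period.
double sellPrice(const Stock& s) {
    for (double p : s.pricesAfterBuy)
        if (p >= 1.5 * s.buyPrice) return p;          // 50% appreciation reached
    return s.pricesAfterBuy.empty() ? s.buyPrice      // no price data: assume flat
                                    : s.pricesAfterBuy.back();
}

int main() {
    // Hypothetical placeholder predicates, not Graham's actual criteria (see [27]).
    std::vector<Criterion> reward = { [](const Stock& s) { return s.financialRatios[0] > 2.0; } };
    std::vector<Criterion> risk   = { [](const Stock& s) { return s.financialRatios[1] < 0.5; } };
    Stock s{{2.5, 0.4}, {10.0, 12.0, 16.0}, 10.0};
    if (shouldBuy(s, reward, risk))
        std::printf("buy at %.2f, sell at %.2f\n", s.buyPrice, sellPrice(s));
    return 0;
}
```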
[6] S. Bochkanov and V. Bystritsky. ALGLIB 2.0.1: L-BFGS algorithm for multivariate optimization. http://www.alglib.net/optimization/lbfgs.php [Last accessed 2009].
[7] J. Y. Campbell and R. J. Shiller. Valuation ratios and the long-run stock market outlook: An update. In Advances in Behavioral Finance II. Princeton University Press, 2005.
[8] L. Cerf, J. Besson, C. Robardet, and J.-F. Boulicaut. Data peeler: Constraint-based closed pattern mining in n-ary relations. In SDM, pp. 37–48, 2008.
[9] H. Cheng, K. A. Hua, and K. Vu. Constrained locally weighted clustering. PVLDB, 1(1):90–101, 2008.
[10] Y. Cheng and G. M. Church. Biclustering of expression data. In ISMB, pp. 93–103, 2000.
[11] Compustat. http://www.compustat.com [Last accessed 2009].
[12] M. Feldstein and J. Green. Why do companies pay dividends? The American Economic Review, 73(1):17–30, 1983.
[13] B. Graham. The Intelligent Investor: A Book of Practical Counsel. Harper Collins Publishers, 1986.
[14] B. Graham and D. Dodd. Security Analysis. McGraw-Hill Professional, 1934.
[15] R. Gupta, G. Fang, B. Field, M. Steinbach, and V. Kumar. Quantitative evaluation of approximate frequent pattern mining algorithms. In KDD, pp. 301–309, 2008.
[16] Investopedia. http://www.investopedia.com/university/ratios/ [Last accessed 2009].
[17] L. Ji, K.-L. Tan, and A. K. H. Tung. Mining frequent closed cubes in 3D datasets. In VLDB, pp. 811–822, 2006.
[18] I. T. Jolliffe. Principal Component Analysis, pp. 115–118. Springer, 2002.
[19] J. Kleinberg, C. Papadimitriou, and P. Raghavan. A microeconomic view of data mining. Data Mining and Knowledge Discovery, 2(4):311–324, 1998.
[20] H.-P. Kriegel, P. Kröger, and A. Zimek. Clustering high-dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering. ACM Transactions on Knowledge Discovery from Data, 3(1):1–58, 2009.
[21] P. Kröger, H.-P. Kriegel, and K. Kailing. Density-connected subspace clustering for high-dimensional data. In SDM, pp. 246–257, 2004.
[22] J. W. Lewellen. Predicting returns with financial ratios. Journal of Financial Economics, 74:209–235, 2004.
[23] G. Liu, J. Li, K. Sim, and L. Wong. Distance based subspace clustering with flexible dimension partitioning. In ICDE, pp. 1250–1254, 2007.
[24] G. Moise and J. Sander. Finding non-redundant, statistically significant regions in high dimensional data: a novel approach to projected and subspace clustering. In KDD, pp. 533–541, 2008.
[25] J. Moody and C. J. Darken. Fast learning in networks of locally-tuned processing units. Neural Computation, 1(2):281–294, 1989.
[26] J. Nocedal and S. J. Wright. Numerical Optimization, pp. 497–528. Springer, 2006.
[27] H. R. Oppenheimer. A test of Ben Graham's stock selection criteria. Financial Analysts Journal, 40(5):68–74, 1984.
[28] T. Uno, M. Kiyomi, and H. Arimura. LCM ver. 2: Efficient mining algorithms for frequent/closed/maximal itemsets. In FIMI, 2004.
[29] U.S. Securities and Exchange Commission. Microcap stock: A guide for investors. http://www.sec.gov/investor/pubs/microcapstock.htm [Last accessed 2009].
[30] K. Wang, S. Zhou, and J. Han. Profit mining: From patterns to actions. In EDBT, pp. 70–87, 2002.
[31] K. Wang, S. Zhou, Q. Yang, and J. M. S. Yeung. Mining customer value: From association rules to direct marketing. Data Mining and Knowledge Discovery, 11(1):57–79, 2005.
[32] X. Xu, Y. Lu, K.-L. Tan, and A. K. H. Tung. Finding time-lagged 3D clusters. In ICDE, pp. 445–456, 2009.
[33] K. Y. Yip, D. W. Cheung, and M. K. Ng. On discovery of extremely low-dimensional clusters using semi-supervised projected clustering. In ICDE, pp. 329–340, 2005.
[34] L. Zhao and M. J. Zaki. TRICLUSTER: an effective algorithm for mining coherent clusters in 3D microarray data. In SIGMOD, pp. 694–705, 2005.

Appendix
The complete expression of ∂F(P)/∂p_{o_i} on attribute a is:

\frac{\partial F(P)}{\partial p_{o_i}} =
  - p_{o_i}\, s(o_i,c)\, u_{o_i}\, \rho_{o_i,c} \cdot \sum_{o,o' \in O \times O,\ o \neq o'} p_o\, p_{o'}\, s(o,c)\, s(o',c)\, \rho_{o,o'}
  \;-\; 2 \sum_{o \in O} p_o\, s(o,c)\, u_o\, \rho_{o,c} \cdot \sum_{o' \in O,\ o' \neq o} p_o\, p_{o'}\, s(o,c)\, s(o',c)\, \rho_{o,o'}
  \;-\; \lambda \;+\; \mu\Big(\sum_{o \in O} p_o - 1\Big)