Hannah Stoney

Semi-Supervised Methods to Predict Patient Survival from Gene Expression Data

Bair E, Tibshirani R (2004) Semi-Supervised Methods to Predict Patient Survival from Gene Expression Data. PLoS Biol 2(4): e108. doi:10.1371/journal.pbio.0020108

Introduction: Cancer patients can react very differently to the same treatment. Tumors that appear similar can be quite different at the molecular level. The researchers looked at diffuse large B-cell lymphoma (DLBCL), which is the most common type of lymphoma in adults. There are several ways to determine which cancer subtype a patient has. Hierarchical clustering of gene expression data has recently identified cancer subtypes successfully; it is considered an unsupervised technique because it does not use clinical data. Researchers can also define subtypes from clinical data, which is a supervised technique. The goal of this research is to combine gene expression data with clinical data to identify cancer subtypes, an approach called semi-supervised. Knowing a tumor's subtype has direct implications for patient survival: a better diagnostic method would help decide whether to treat the cancer more aggressively or whether milder treatments can still achieve a high success rate. If the diagnosis is as accurate as possible, the cancer can perhaps be treated in a timely and safe manner for the patient, and the lifespan of cancer patients can hopefully be increased.

Materials and Methods: They took 88 patients with small, round, blue cell tumors for the testing. The testing included the nearest shrunken centroids procedure, which is equivalent to diagonal linear discriminant analysis applied to the genes that survive thresholding. A label based on median survival time was used to calculate a prediction of survival. The researchers also used the supervised clustering procedure. This procedure includes calculating Cox scores for the genes, creating a list of the most significant genes, and calculating the prediction of survival.
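
The first two supervised clustering steps listed above (score each gene, then keep the most significant ones) can be sketched in Python. This is a minimal illustration and not the authors' code: it assumes the Cox score is the standard univariate score statistic evaluated at a zero coefficient, and the function names `cox_score` and `select_genes`, the `top_k` parameter, and the toy data are all hypothetical.

```python
import math


def cox_score(expression, times, events):
    """Univariate Cox score statistic for one gene, evaluated at beta = 0.

    expression: expression value of this gene for each patient
    times:      follow-up time for each patient
    events:     1 if death was observed, 0 if the patient was censored
    """
    numerator = 0.0  # sum over deaths of (x_i - mean of x in the risk set)
    variance = 0.0   # sum over deaths of the variance of x in the risk set
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored patients contribute only through risk sets
        # Risk set: every patient still under observation at time t_i.
        risk = [x for x, t in zip(expression, times) if t >= t_i]
        mean = sum(risk) / len(risk)
        mean_sq = sum(x * x for x in risk) / len(risk)
        numerator += expression[i] - mean
        variance += mean_sq - mean * mean
    return numerator / math.sqrt(variance) if variance > 0 else 0.0


def select_genes(expression_by_gene, times, events, top_k):
    """Rank genes by absolute Cox score; return the top_k names and all scores."""
    scores = {
        gene: cox_score(values, times, events)
        for gene, values in expression_by_gene.items()
    }
    ranked = sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
    return ranked[:top_k], scores


# Toy data (made up): six patients, one of them censored.
times = [2, 4, 6, 8, 10, 12]
events = [1, 1, 1, 1, 0, 1]
expression_by_gene = {
    "gene_a": [9, 8, 7, 3, 2, 1],  # high expression goes with early death
    "gene_b": [5, 1, 4, 2, 5, 3],  # no clear relation to survival
}
selected, scores = select_genes(expression_by_gene, times, events, top_k=1)
print(selected, scores)
```

A positive score means that patients who died tended to have above-average expression among those still at risk. In the procedure described above, the patients would then be clustered using only the selected genes, so survival information guides the clustering without defining the clusters directly.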

Results: To reach a conclusion, the researchers looked at a dataset that consisted of 7399 genes from 240 separate patients. The model was built on 160 of those patients, and the remaining 80 were used to validate it. Different groups and subgroups were used to establish the results. The patients' survival times were recorded, and each patient was then placed into either a low-risk or a high-risk group according to the length of that survival time. The researchers found that conventional clustering techniques failed to identify subgroups related to survival time: hierarchical clustering did not show differences in survival between the groups it found, whereas supervised clustering produced subgroups with highly significant differences in survival.
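
The risk grouping and the 160/80 split described above can be illustrated with a short Python sketch. It assumes the cutoff between low-risk and high-risk is the cohort's median survival time and, for simplicity, ignores censored patients; the function names `median_risk_labels` and `split_train_test` and the example values are hypothetical, and how patients were actually assigned to the two sets is not stated in this summary.

```python
import statistics


def median_risk_labels(survival_times):
    """Label each patient by comparing survival time with the cohort median.

    Patients below the median are 'high-risk'; ties go to 'low-risk'
    (an arbitrary choice made for this sketch).
    """
    cutoff = statistics.median(survival_times)
    return ["high-risk" if t < cutoff else "low-risk" for t in survival_times]


def split_train_test(patient_ids, n_train):
    """Use the first n_train patients for training and hold out the rest."""
    return patient_ids[:n_train], patient_ids[n_train:]


# 240 placeholder patient IDs, split 160 / 80 as in the summary above.
patient_ids = list(range(240))
train_ids, test_ids = split_train_test(patient_ids, n_train=160)
print(len(train_ids), len(test_ids))

# Made-up survival times in years for four patients.
print(median_risk_labels([1.0, 2.5, 4.0, 7.0]))
```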

Discussion: If we can distinguish between subtypes of certain cancers, we can use classification methods to diagnose those subtypes in future patients. It is important to develop methods to identify subgroups, so that we can determine which procedures and policies apply to each subgroup. Unsupervised methods are popular, yet the subgroups they find are not always clinically significant. Relying primarily on clinical data to diagnose subclasses is also likely to be inaccurate. Supervised clustering methods are easier to interpret and are useful prognostic tools. However, it is critical to keep in mind that none of the mentioned methods is 100% accurate. Information about a given patient's risk of metastasis and death can be essential to treating the cancer successfully. If the risk is high, the cancer must be treated aggressively. Conversely, if the risk is low, milder procedures at a gentler pace can still be very successful. Using DNA microarrays, researchers have already been able to identify subtypes of cancer to assess a patient's risk profile. The results of this study suggest that a semi-supervised learning method can identify those subtypes and predict patient survival better than current methods can. Perhaps these processes can be used as a powerful tool to diagnose and, in turn, treat cancer.