Application of discriminant analysis and cross-validation on proteomics data

Julia Kuligowski, David Pérez-Guaita, Guillermo Quintás

Research output: Chapter in Book/Report/Conference proceeding; Chapter (Book); Other; peer-review

6 Citations (Scopus)

Abstract

High-throughput proteomic experiments have raised the importance and complexity of bioinformatics analysis to extract useful information from raw data. Discriminant analysis is frequently used to identify differences among test groups of individuals or to describe combinations of discriminant variables. However, even in relatively large studies, the number of detected variables typically far exceeds the number of samples, and the classifiers should be thoroughly validated to assess their performance for new samples. Cross-validation is a widely used approach when an external validation set is not available. In this chapter, different approaches for cross-validation are presented, including relevant aspects that should be taken into account to avoid overly optimistic results and the assessment of the statistical significance of cross-validated figures of merit.
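The two ideas named in the abstract, cross-validated performance and its statistical significance, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the data are synthetic (20 samples, 50 variables, so variables outnumber samples), a nearest-centroid rule stands in for the discriminant model (it is not PLS-DA), and the helper names `kfold_indices`, `cv_accuracy` and `permutation_p_value` are hypothetical. The classifier is refit inside every fold so that held-out samples never influence the model, and the permutation test compares the observed accuracy with accuracies obtained after shuffling the class labels.

```python
import random

def kfold_indices(n, k, rng):
    """Shuffle sample indices and split them into k disjoint test folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def centroid(rows):
    """Column-wise mean of a list of equally long rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def predict(x, centroids):
    """Return the class whose centroid is nearest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
    )

def cv_accuracy(X, y, k, rng):
    """k-fold cross-validated accuracy; the model is refit on each training split."""
    correct = 0
    for test in kfold_indices(len(X), k, rng):
        held_out = set(test)
        train = [i for i in range(len(X)) if i not in held_out]
        # Centroids are computed from training samples only.
        centroids = {
            c: centroid([X[i] for i in train if y[i] == c])
            for c in {y[i] for i in train}
        }
        correct += sum(predict(X[i], centroids) == y[i] for i in test)
    return correct / len(X)

def permutation_p_value(X, y, k, n_perm, rng):
    """Fraction of label permutations scoring at least as well as the real labels."""
    observed = cv_accuracy(X, y, k, rng)
    hits = 0
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)  # break the link between variables and class labels
        if cv_accuracy(X, y_perm, k, rng) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)

# Synthetic, hypothetical data: 2 classes x 10 samples, 50 variables,
# of which the first 5 are shifted in class 1.
rng = random.Random(0)
n_per_class, n_vars, n_informative = 10, 50, 5
X, y = [], []
for label in (0, 1):
    for _ in range(n_per_class):
        row = [rng.gauss(0.0, 1.0) for _ in range(n_vars)]
        if label == 1:
            for j in range(n_informative):
                row[j] += 10.0
        X.append(row)
        y.append(label)

observed, p_value = permutation_p_value(X, y, k=5, n_perm=99, rng=rng)
print(f"cross-validated accuracy: {observed:.2f}, permutation p-value: {p_value:.3f}")
```

Any step that uses the class labels, such as variable selection or tuning model complexity, would need to sit inside the training split as well; performing it on the full data set before splitting is a common source of overly optimistic cross-validated figures.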

Original language: English
Title of host publication: Statistical Analysis in Proteomics
Editors: Klaus Jung
Place of Publication: New York NY USA
Publisher: Humana Press
Chapter: 11
Pages: 175-184
Number of pages: 10
Volume: 1362
ISBN (Electronic): 9781493931064
ISBN (Print): 9781493931057
Publication status: Published - 2016

Publication series

Name: Methods in Molecular Biology
Volume: 1362
ISSN (Print): 1064-3745
ISSN (Electronic): 1940-6029

Keywords

  • Cross-validation
  • Discriminant analysis
  • Double cross-validation
  • Partial least squares-discriminant analysis
  • Proteomics
