Cover
Foreword
Preface
Contributors
About the Companion Website
1 Biological Sequence Databases
1. Introduction
2. Nucleotide Sequence Databases
3. Nucleotide Sequence Flatfiles: A Dissection
4. Protein Sequence Databases
5. Summary
6. Acknowledgments
7. Internet Resources
8. Further Reading
9. References
2 Information Retrieval from Biological Databases
1. Introduction
2. Integrated Information Retrieval: The Entrez System
3. Medical Databases
4. Organismal Sequence Databases Beyond NCBI
5. Summary
6. Further Reading
7. References
3 Assessing Pairwise Sequence Similarity: BLAST and FASTA
1. Introduction
2. Global Versus Local Sequence Alignments
3. Scoring Matrices
4. BLAST
5. BLAST 2 Sequences
6. MegaBLAST
7. PSI-BLAST
8. BLAT
9. FASTA
10. Summary
11. Further Reading
12. References
4 Genome Browsers
1. Introduction
2. The UCSC Genome Browser
3. UCSC Table Browser
4. Ensembl Genome Browser
5. Ensembl BioMart
6. JBrowse
7. Summary
8. Further Reading
9. References
5 Genome Annotation
1. Introduction
2. Gene Prediction Methods
3. Ab Initio Gene Prediction in Prokaryotic Genomes
4. Ab Initio Gene Prediction in Eukaryotic Genomes
5. How Well Do Gene Predictors Work?
6. Assessing Prokaryotic Gene Predictors
7. Assessing Eukaryotic Gene Predictors
8. Evidence Generation for Genome Annotation
9. Gene Annotation and Evidence Generation using Comparative Gene Prediction
10. Genome Annotation Pipelines
11. Summary
12. Acknowledgments
13. Internet Resources
14. Further Reading
15. References
6 Predictive Methods Using RNA Sequences
1. Introduction
2. Overview of RNA Secondary Structure Prediction Using Thermodynamics
3. Dynamic Programming
4. Accuracy of RNA Secondary Structure Prediction
5. Predicting the Secondary Structure Common to Multiple RNA Sequences
6. Practical Introduction to Single-Sequence Methods
7. Practical Introduction to Multiple Sequence Methods
8. Other Computational Methods to Study RNA Structure
9. Comparison of Methods
10. Predicting RNA Tertiary Structure
11. Summary
12. Further Reading
13. References
7 Predictive Methods Using Protein Sequences
1. Introduction
2. One-Dimensional Prediction of Protein Structure
3. Predicting Protein Function
4. Summary
5. Further Reading
6. References
8 Multiple Sequence Alignments
1. Introduction
2. Measuring Multiple Alignment Quality
3. Making an Alignment: Practical Issues
4. Commonly Used Alignment Packages
5. Viewing a Multiple Alignment
6. Summary
7. References
9 Molecular Evolution and Phylogenetic Analysis
1. Introduction
2. Early Classification Schemes
3. Sequences As Molecular Clocks
4. Background Terminology and the Basics
5. How to Construct a Tree
6. Marker-Based Evolution Studies
7. Phylogenetic Analysis and Data Integration
8. Future Challenges
9. References
10 Expression Analysis
1. Introduction
2. Step 0: Choose an Expression Analysis Technology
3. Step 1: Design the Experiment
4. Step 2: Collect and Manage the Data – and Metadata
5. Step 3: Data Pre-Processing
6. Step 4: Quality Control
7. Step 5: Normalization and Batch Effects
8. Step 6: Exploratory Data Analysis
9. Step 7: Differential Expression Analysis
10. Step 8: Exploring Mechanisms Through Functional Enrichment Analysis
11. Step 9: Developing a Classifier
12. Single-Cell Sequencing
13. Summary
14. Further Reading
15. References
11 Proteomics and Protein Identification by Mass Spectrometry
1. Introduction
2. Mass Spectrometry
3. Tandem Mass Spectrometry for Peptide Identification
4. Sample Preparation
5. Bioinformatics Analysis for MS-based Proteomics
6. Proteomics Strategies
7. Peptide Mass Fingerprinting
8. PMF on the Web
9. Proteomics and Tandem MS
10. PSM Software
11. PSM on the Web
12. Reporting Standards
13. Proteomics Data Repositories
14. Protein/Proteomics Databases
15. Selected Applications of Proteomics
16. Summary
17. Acknowledgments
18. Internet Resources
19. Further Reading
20. References
12 Protein Structure Prediction and Analysis
1. Introduction to Protein Structures
2. How Protein Structures are Determined
3. How Protein Structures are Described
4. Protein Structure Databases
5. Visualizing Proteins
6. Protein Structure Prediction
7. Protein Structure Evaluation
8. Protein Structure Comparison
9. Summary
10. Further Reading
11. References
13 Biological Networks and Pathways
1. Introduction
2. Pathway and Molecular Interaction Mapping: Experiments and Predictions
3. Pathway and Molecular Interaction Databases: An Overview
4. Pathway Databases
5. Molecular Interaction Databases
6. Functional Interaction Databases
7. Strategies for Navigating Pathway and Interaction Databases
8. Standard Data Formats for Pathways and Molecular Interactions
9. Pathway Visualization and Analysis
10. Network Visualization and Analysis
11. Summary
12. Acknowledgments
13. Internet Resources
14. Further Reading
15. References
14 Metabolomics
1. Introduction
2. Data Formats
3. Databases
4. Bioinformatics for Metabolite Identification
5. Multivariate Statistics
6. Bioinformatics for Metabolite Interpretation
7. Summary
8. Further Reading
9. References
15 Population Genetics
1. Introduction
2. Evolutionary Processes and Genetic Variation
3. Allele Frequencies and Population Variation
4. Display Methods
5. Demographic History Inference
6. Admixture and Ancestry Estimation
7. Detection of Natural Selection
8. Other Applications
9. Summary
10. References
16 Metagenomics and Microbial Community Analysis
1. Introduction
2. Why Study the Microbiome?
3. The Origins of Microbiome Analysis
4. Metagenomic Workflow
5. General Considerations in Marker-Gene and Metagenomic Data Analysis
6. Marker Genes
7. Metagenomic Data Analysis
8. Other Techniques to Characterize the Microbiome
9. Summary
10. Further Reading
11. References
17 Translational Bioinformatics
1. Introduction
2. Databases Describing the Genetics of Human Health
3. Prediction and Characterization of Impactful Genetic Variants from Sequence
4. Computing with Patient Phenotype Using Data in Electronic Health Records
5. Informatics and Precision Medicine
6. Ethical, Legal, and Social Implications of Translational Medicine
7. Summary
8. References
18 Statistical Methods for Biologists
1. Introduction
2. Descriptive Representations of Data
3. Statistical Inference and Statistical Hypothesis Testing
4. Summary
5. Acknowledgments
6. Internet Resources
7. Further Reading
8. References
Appendices
1. 1.1 Example of a Flatfile Header in ENA Format
2. 1.2 Example of a Flatfile Header in DDBJ/GenBank Format
3. 1.3 Example of a Feature Table in ENA Format
4. 1.4 Example of a Feature Table in GenBank/DDBJ Format
5. 6.1 Dynamic Programming
6. Reference
Glossary
Index
End User License Agreement

List of Tables

Chapter 1
1. Table 1.1 Indicating locations within the feature table.
Chapter 2
1. Table 2.1 Entrez Boolean search statements.
Chapter 3
1. Table 3.1 Selecting an appropriate scoring matrix.
2. Table 3.2 BLAST algorithms.
3. Table 3.3 Main FASTA algorithms.
Chapter 7
1. Table 7.1 Disorder prediction performance.
2. Table 7.2 Performance of selected gene ontology term prediction methods in CAFA2...
Chapter 8
1. Table 8.1 Aligner performance on BAliBASE3 benchmark.
Chapter 9
1. Table 9.1 Some common software packages implementing different phylogenetic anal...
Chapter 11
1. Table 11.1 List of common sources of protein sequences (used in FASTA format).
2. Table 11.2 Standard search parameters used with sequence database search engines...
Chapter 12
1. Table 12.1 Relationship between backbone root mean square deviation (RMSD, i...
Chapter 14
1. Table 14.1 A list of freely available molecular editors and visualization tools.
2. Table 14.2 A list of open access chemical, spectral, pathway, and metabolomic da...
Chapter 15
1. Table 15.1 Examples of genes that have undergone natural selection in human popu...
Chapter 17
1. Table 17.1 Examples of commonly used biomedical ontologies and terminologies in ...
Chapter 18
1. Table 18.1 Common parametric statistical tests and their non-parametric equivale...

List of Illustrations

Chapter 1
1. Figure 1.1 The landing page for ENA record U54469.1, providing a graphical vie...
2. Figure 1.2 Results of a search for the human heterogeneous nuclear ribosomal p...
3. Figure 1.3 The Subcellular location and Pathology & Biotech sections of ...
4. Figure 1.4 The Feature viewer rendering of the record for the human heterogene...
5. Figure 1.5 Expanding the PTM, Structural features, and Variants sections withi...
Chapter 2
1. Figure 2.1 The exponential growth of GenBank in terms of number of nucleotides...
2. Figure 2.2 Results of a text-based Entrez query against PubMed using Boolean o...
3. Figure 2.3 An example of a PubMed record in Abstract format, as returned throu...
4. Figure 2.4 Neighbors to an entry found in PubMed. The original entry from Figu...
5. Figure 2.5 The Entrez Gene page for the DCC (deleted in colorectal carcinoma) ...
6. Figure 2.6 A section of the Database of Single Nucleotide Polymorphisms (dbSNP...
7. Figure 2.7 Entries in the RefSeq protein database corresponding to the origina...
8. Figure 2.8 The RefSeq entry for the netrin receptor, the protein product of th...
9. Figure 2.9 The same RefSeq entry for the netrin receptor shown in Figure 2.8, ...
10. Figure 2.10 Protein structures associated with the RefSeq entry for the human ...
11. Figure 2.11 The structure summary page for pdb:4URT, the crystal structure of ...
12. Figure 2.12 A list of structures deemed similar to pdb:4URT using VAST+. The t...
13. Figure 2.13 Online Mendelian Inheritance in Man (OMIM) entries related to the...
14. Figure 2.14 The Online Mendelian Inheritance in Man (OMIM) entry for the DCC g...
15. Figure 2.15 An example of a list of allelic variants that can be found through...
16. Figure 2.16 The ClinicalTrials.gov page showing all actively recruiting clinic...
17. Figure 2.17 A clickable map showing where actively recruiting clinical trials ...
18. Figure 2.18 The Mouse Genome Informatics (MGI) entry for the Dcc gene in mouse...
19. Figure 2.19 The Zebrafish Information Network (ZFIN) gene page for the dcc gen...
20. Figure 2.20 An example of gene expression data available through the Zebrafish...
Chapter 3
1. Figure 3.1 The BLOSUM62 scoring matrix (Henikoff and Henikoff 1992). BLOSUM62 ...
2. Figure 3.2 A nucleotide scoring table. The scoring for the four nucleotide bas...
3. Figure 3.3 The initiation of a BLAST search. The search begins with query word...
4. Figure 3.4 BLAST search extension. Length of extension represents the number o...
5. Figure 3.5 The National Center for Biotechnology Information (NCBI) BLAST land...
6. Figure 3.6 The upper portion of the BLASTP query page. The first section in th...
7. Figure 3.7 The lower portion of the BLASTP query page, showing algorithm param...
8. Figure 3.8 Graphical display of BLASTP results. The query sequence is represen...
9. Figure 3.9 The BLASTP “hit list.” For each sequence found, the user is present...
10. Figure 3.10 Detailed information on a representative BLASTP hit. The header pr...
11. Figure 3.11 Performing a BLAST 2 Sequences alignment. Clicking the check box a...
12. Figure 3.12 Typical output from a BLAST 2 Sequences alignment, based on the qu...
13. Figure 3.13 Constructing a position-specific scoring matrix (PSSM). In the upp...
14. Figure 3.14 Performing a PSI-BLAST search. See text for details.
15. Figure 3.15 Selecting algorithm parameters for a PSI-BLAST search. See text fo...
16. Figure 3.16 Results of the first round of a PSI-BLAST search. For each sequenc...
17. Figure 3.17 Results of the second round of a PSI-BLAST search. New sequences i...
18. Figure 3.18 Submitting a BLAT query. A rat clone from the Cancer Genome Anatom...
19. Figure 3.19 Results of a BLAT query. Based on the query submitted in Figure 3....
20. Figure 3.20 The FASTA search strategy. (a) Once FASTA determines words of leng...
21. Figure 3.21 Search summary from a protein–protein FASTA search, using the sequ...
22. Figure 3.22 Hit list for the protein–protein FASTA search described in Figure ...
Chapter 4
1. Figure 4.1 The home page of the UCSC Genome Browser, showing a query for the g...
2. Figure 4.2 The default view of the UCSC Genome Browser, showing the genomic co...
3. Figure 4.3 The genomic context of the human HIF1A gene, after clicking on zoom...
4. Figure 4.4 The RefSeq Track Settings page. The track settings pages are used t...
5. Figure 4.5 The genomic context of the human HIF1A gene, after displaying RefSe...
6. Figure 4.6 The Get Genomic Sequence page that provides an interface for users ...
7. Figure 4.7 The genomic context of the human HIF1A gene, after changing the dis...
8. Figure 4.8 Configuring the track settings for the Common SNPs(150) track. Set ...
9. Figure 4.9 The genomic context of the human HIF1A gene, after changing the col...
10. Figure 4.10 The GTEx Gene track, which depicts median gene expression levels i...
11. Figure 4.11 BLAT search at the UCSC Genome Browser. (a) This page shows the re...
12. Figure 4.12 Configuring the UCSC Table Browser. The link to the Table Browser ...
13. Figure 4.13 The home page of the Ensembl Genome Browser, showing a query for t...
14. Figure 4.14 The Gene tab for the human PAH gene. This landing page provides li...
15. Figure 4.15 Computationally predicted orthologs of the human PAH gene, from th...
16. Figure 4.16 The Location tab for the human PAH gene. The Location tab is divid...
17. Figure 4.17 Zooming in on the bottom section of the Location tab from Figure 4...
18. Figure 4.18 The Ensembl Variant tab. (a) To get more details about SNP rs76296...
19. Figure 4.19 The Ensembl Regulatory Build track. (a) Go to Configure this page ...
20. Figure 4.20 The Synteny view at Ensembl. (a) An overview of the syntenic block...
21. Figure 4.21 Ensembl BLAST output, showing an alignment between the human ADAM1...
22. Figure 4.22 Using BioMart to retrieve the mouse orthologs of the human RefSeqs...
23. Figure 4.23 JBrowse display of a predicted Mnemiopsis gene (ML05372a) from the...
Chapter 5
1. Figure 5.1 A simplified depiction of a prokaryotic gene or open reading frame ...
2. Figure 5.2 A simplified depiction of a eukaryotic gene illustrating the multi-...
3. Figure 5.3 A schematic illustration of the upstream regions of a eukaryotic ge...
4. Figure 5.4 A schematic illustration of the splice site regions around exons an...
5. Figure 5.5 Sample output from a GENSCAN analysis of the uroporphyrinogen decar...
6. Figure 5.6 Schematic representation of measures of gene prediction accuracy at...
7. Figure 5.7 The typical L-shaped structure of a tRNA molecule. This depicts the...
8. Figure 5.8 A screenshot montage of the PHASTER web server showing the website ...
9. Figure 5.9 A screenshot of a BASys bacterial genome annotation output for the ...
Chapter 6
1. Figure 6.1 The three levels of organization of RNA structure. (a) The primary ...
2. Figure 6.2 The RNA secondary structure of the 3′ untranslated region of the Dr...
3. Figure 6.3 An illustration of the equilibria of RNA structures in solution. (a...
4. Figure 6.4 Prediction of conformational free energy for a conformation of RNA ...
5. Figure 6.5 A simple RNA pseudoknot. This figure illustrates two representation...
6. Figure 6.6 The input form for the version 3.1 Mfold server. (a) The top and (b...
7. Figure 6.7 The output page for the Mfold server. Please refer to the text for ...
8. Figure 6.8 Sample output from the Mfold web server, version 3.1. (a) The secon...
9. Figure 6.9 RNAstructure web server input form. (a) The top and (b) the bottom ...
10. Figure 6.10 Sample output from the RNAstructure web server showing the predict...
11. Figure 6.11 Input form for the RNAstructure web server for multiple-sequence p...
12. Figure 6.12 Sample output from the RNAstructure web server for multiple-sequen...
Chapter 7
1. Figure 7.1 Dashboard of the PredictProtein web server. PredictProtein (Yachdav...
2. Figure 7.2 Protein secondary structure. Experimentally determined three-dimens...
3. Figure 7.3 Accessible surface area (ASA). The ASA describes the surface that i...
4. Figure 7.4 Protein secondary structure. Prediction of secondary structure, sol...
5. Figure 7.5 Types of transmembrane proteins. Experimentally determined three-di...
6. Figure 7.6 Transmembrane helix prediction by TMSEG. TMSEG (Bernhofer et al. 20...
7. Figure 7.7 Annotations of human tumor suppressor P53 (P53_HUMAN). (a) InterPro...
8. Figure 7.8 Prediction of subcellular localization. Visual output from LocTree3...
9. Figure 7.9 From predicting single amino acid sequence variant (SAV) effects to...
Chapter 8
1. Figure 8.1 An example multiple sequence alignment of seven globin protein sequ...
2. Figure 8.2 An outline of the simple progressive multiple alignment process. Th...
3. Figure 8.3 Aligner accuracy versus total single-threaded run time using the BA...
4. Figure 8.4 Total single-threaded execution time (y-axis) for different aligner...
5. Figure 8.5 Ratio of total run time relative to single-threaded execution (y-ax...
6. Figure 8.6 Protein and RNA multiple sequence alignments as visualized using Ja...
7. Figure 8.7 Linked coding sequence (CDS), protein, and three-dimensional struct...
Chapter 9
1. Figure 9.1 Different ways to visualize a tree. In this example, the same tree ...
2. Figure 9.2 Alignments illustrating sequence similarity versus sequence identit...
3. Figure 9.3 The differences between orthologs, paralogs, and xenologs. The ance...
4. Figure 9.4 The difference between phylogenetic signal and phylogenetic noise. ...
5. Figure 9.5 Character-based versus distance-based phylogenetic methods. Charact...
6. Figure 9.6 Rooting a tree with an outgroup. Escherichia coli bacteria are comm...
7. Figure 9.7 Workflow for a protein-based phylogenetic analysis using the PHYLIP...
8. Figure 9.8 Phylogenetic relationships can be visualized using different types ...
9. Figure 9.9 Excerpt of a Salmonella minimum spanning tree. Types of Salmonella ...
Chapter 10
1. Figure 10.1 Example of an MA plot before (a) and after (b) normalization. A, o...
2. Figure 10.2 Histogram of the base mismatch (MM) rate across multiple RNA-seq s...
3. Figure 10.3 Overview of quantile normalization. We start with the box on the t...
4. Figure 10.4 Batch effects principal components analysis (PCA) example. Boxplot...
5. Figure 10.5 A simple illustration of the process of hierarchical clustering. (...
6. Figure 10.6 Heatmap showing clustering of gene expression data of the 100 most...
7. Figure 10.7 First two components of principal component analysis (PCA) on the ...
8. Figure 10.8 Principal component analysis (PCA) is a dimensionality reduction m...
9. Figure 10.9 Illustration of how one can select k when performing consensus clu...
10. Figure 10.10 Receiver operating characteristic (ROC) curve for a model desig...
Chapter 11
1. Figure 11.1 Gene(s) to proteoforms. This figure illustrates the complexity of ...
2. Figure 11.2 Quadrupole mass analyzer. Schematic of a quadrupole mass analyzer,...
3. Figure 11.3 Time of flight (TOF) mass analyzer. Schematic of a TOF mass analyz...
4. Figure 11.4 (a) Tandem mass spectrometry (MS). Schematic of a triple quadrupol...
5. Figure 11.5 Fragmentation tandem mass spectrometry (MS/MS, or MS²) spectrum. A...
6. Figure 11.6 Polypeptide backbone cleavage produces different product ion speci...
7. Figure 11.7 Post-translational modifications (PTMs) take place at different am...
8. Figure 11.8 Data pre-processing workflow of a mass spectrum. Different steps i...
9. Figure 11.9 Shotgun proteomics workflow. Schematic showing different steps inv...
10. Figure 11.10 A schematic diagram comparing the label-free approach with the di...
11. Figure 11.11 Peptide mass fingerprinting (PMF) workflow. Schematic showing dif...
12. Figure 11.12 Mascot peptide mass fingerprinting (PMF). PMF submission screen a...
13. Figure 11.13 Peptide sequencing via tandem mass spectrometry (MS/MS) spectra i...
14. Figure 11.14 Peptide sequence tag searching. Schematic illustrating how a sequ...
15. Figure 11.15 Peptide spectrum match (PSM). Annotated MS² spectrum showing matc...
16. Figure 11.16 Mascot search engine. Mascot MS² database search submission windo...
17. Figure 11.17 Proteomics. A broad classification of proteomics and the biologic...
Chapter 12
1. Figure 12.1 A flow diagram illustrating the steps used to experimentally prepa...
2. Figure 12.2 An example of a nuclear magnetic resonance (NMR) “blurrogram” of a...
3. Figure 12.3 The different levels of protein structures illustrating: (a) prima...
4. Figure 12.4 Examples of different types of protein folds including (a) the fou...
5. Figure 12.5 An illustration of standard amino acid residue and peptide bond ge...
6. Figure 12.6 An example of a Protein Data Bank formatted file showing the first...
7. Figure 12.7 A Ramachandran plot for the thioredoxin protein (Protein Data Bank...
8. Figure 12.8 A screenshot of the Research Collaboratory for Structural Bioinfor...
9. Figure 12.9 A screenshot of an image of Escherichia coli thioredoxin as genera...
10. Figure 12.10 An illustration of the four major approaches to rendering protein...
11. Figure 12.11 An example of the high-quality images that can be created using a...
12. Figure 12.12 An illustration of a homology model (b) of Escherichia coli thior...
13. Figure 12.13 A schematic illustration of how threading is performed. (a) A que...
14. Figure 12.14 An example of the high-quality postscript output data from PROCHE...
15. Figure 12.15 An example of the CATH database description of Escherichia coli t...
Chapter 13
1. Figure 13.1 The Reactome database pathway view. The central view shows pathway...
2. Figure 13.2 The EcoCyc database cellular overview of Escherichia coli metaboli...
3. Figure 13.3 An example of metabolic pathway reconstruction from Kyoto Encyclop...
4. Figure 13.4 A BioGRID database record. A screenshot of the result page for a B...
5. Figure 13.5 An IntAct database search for the human MDM2 gene. A summary of al...
6. Figure 13.6 An example of the main STRING query result page. A network of rela...
7. Figure 13.7 A query result from GeneMANIA. Each node in the network represents...
8. Figure 13.8 The AKT pathway as represented by a traditional method (top left, ...
9. Figure 13.9 The main components of the Proteomics Standards Initiative–Molecul...
10. Figure 13.10 The valine biosynthesis pathway dynamically drawn by the Pathway ...
11. Figure 13.11 Output from the PathVisio software showing a portion of a human c...
12. Figure 13.12 The set of symbol types available in the Systems Biology Graphica...
13. Figure 13.13 The Drosophila melanogaster cell cycle drawn using Systems Biolog...
14. Figure 13.14 The results of pathway enrichment analysis using the g:Profiler t...
15. Figure 13.15 A Gene Set Enrichment Analysis (GSEA) enrichment figure. The bott...
16. Figure 13.16 An enrichment map showing two enriched themes. Each node represen...
17. Figure 13.17 An introduction to terminology and visual notation used in the co...
18. Figure 13.18 Zooming in on a network in Cytoscape shows part of a large connec...
19. Figure 13.19 An overview of a pathway analysis workflow, summarizing multiple ...
Chapter 14
1. Figure 14.1 A diagram illustrating the typical workflow for a metabolomic expe...
2. Figure 14.2 An example of a Molecular Design Limited (MDL) chemical fingerprin...
3. Figure 14.3 An example of a MOL file for a two-dimensional representation of L...
4. Figure 14.4 An example of an nmrML data file for L-alanine. The actual file is...
5. Figure 14.5 The JSpectraViewer image for L-alanine. JSpectraViewer is a Java a...
6. Figure 14.6 A selection of two screenshots from the PubChem web pages for the ...
7. Figure 14.7 Two screenshots of the gas chromatography–mass spectrometry (GC-MS...
8. Figure 14.8 Two screenshots from the Human Metabolome Database (HMDB) entry fo...
9. Figure 14.9 A simplified illustration of how spectral deconvolution works for ...
10. Figure 14.10 Two screenshots of the Bayesil web server. (a) A nuclear magnetic...
11. Figure 14.11 An illustration of how spectral deconvolution works for gas chrom...
12. Figure 14.12 An illustration of how principal component analysis can be though...
13. Figure 14.13 A three-dimensional principal component analysis (PCA) “scores” p...
14. Figure 14.14 The MetaboAnalyst Module Overview page. This page allows users to...
15. Figure 14.15 The MetaboAnalyst Data Upload page. This page allows users to upl...
16. Figure 14.16 The MetaboAnalyst Data Normalization page. The optimal normalizat...
17. Figure 14.17 The MetaboAnalyst Data Normalization and Scaling results, generat...
18. Figure 14.18 A two-dimensional principal component analysis (PCA) “scores” plo...
19. Figure 14.19 The principal component analysis (PCA) “loadings” plot, showing t...
20. Figure 14.20 The partial least squares discriminant analysis (PLS-DA) plot sho...
21. Figure 14.21 An example of an R²/Q² plot generated by MetaboAnalyst using the ...
22. Figure 14.22 A variable importance in projection plot showing which metabolite...
23. Figure 14.23 A pathway impact plot showing the importance of different pathway...
Chapter 15
1. Figure 15.1 Principal components analysis (PCA) of nine world populations and ...
2. Figure 15.2 The coalescent process. Although the ancestral population contains...
3. Figure 15.3 Multiple sequentially Markovian coalescent (MSMC) estimate of popu...
4. Figure 15.4 Admixture analysis of nine populations and three test samples. Ind...
5. Figure 15.5 A Manhattan plot of Composite of Multiple Signals (CMS) scores (Y-...
Chapter 16
1. Figure 16.1 General workflow for DNA-based microbiome analysis.
2. Figure 16.2 FastQC summary of DNA sequence read quality for an Illumina sequen...
3. Figure 16.3 Primary structure and variable regions of the 16S ribosomal RNA ge...
4. Figure 16.4 k-mer decomposition of a nucleotide sequence with k = 2. Two seque...
5. Figure 16.5 Rarefaction curves for microbial communities sampled from six diff...
6. Figure 16.6 Unweighted phylogenetic alpha- and beta-diversity measures. Left: ...
7. Figure 16.7 Principal coordinate analysis (a) vs. non-metric multidimensional ...
8. Figure 16.8 Visualizing the differences between two groups of gut microbiome s...
Chapter 17
1. Figure 17.1 ClinVar entry for a benign variant in the cystic fibrosis gene (CF...
2. Figure 17.2 Receiver operating characteristic (ROC) curves of five submissi...
Chapter 18
1. Figure 18.1 Relationships between observation, data, information, and knowledg...
2. Figure 18.2 Types of variables and their hierarchical relationships.
3. Figure 18.3 Organization of an example dataset. (a) Part of a two-dimensional ...
4. Figure 18.4 Commonly used descriptive statistics for sample variables. Light b...
5. Figure 18.5 Covariance versus correlation. The red sample has higher sample va...
6. Figure 18.6 Example histogram demonstrating the frequency of black cherry tree...
7. Figure 18.7 Example boxplot and related variant graphs. (a) Schematic diagram ...
8. Figure 18.8 Anscombe's quartet. Scatterplots with regression lines for four fa...
9. Figure 18.9 Scatterplot of the first two principal components (PCs) from princ...
10. Figure 18.10 Example of how to make a graph descriptive.
11. Figure 18.11 The standard normal distribution.
12. Figure 18.12 Other well-described discrete and continuous distributions common...
13. Figure 18.13 Bond length and coordination angle histograms for coordinated met...
14. Figure 18.14 Overview of the process of statistical inference. FUV stands for ...
15. Figure 18.15 Truth table with descriptions of type I and II errors.
16. Figure 18.16 Diagram illustrating the relationships between a probability dens...
17. Figure 18.17 Using a Student's t-test to test a null hypothesis.
18. Figure 18.18 Relationship between population and sample mean distributions.
19. Figure 18.19 An approximate power analysis diagram for a Student's t-test.
Appendices
1. Figure 6.A.1 Pseudo-computer code for the fill order of V(i,j) and W(i,j). Thi...
2. Figure 6.A.2 The filled V(i,j) array for sequence GCGGGUACCGAUCGUCGC.
3. Figure 6.A.3 The filled W(i,j) array for sequence GCGGGUACCGAUCGUCGC.
4. Figure 6.A.4 Illustrations of maximum hydrogen bond conformations as found by ...
5. Figure 6.A.5 Flowchart for structure traceback. Traceback starts by placing 1,...
6. Figure 6.A.6 The secondary structure of rGCGGGUACCGAUCGUCGC with 17 hydrogen b...