Genomic Signal Processing (Princeton Series in Applied Mathematics) - Hardcover

Book 21 of 33: Princeton Series in Applied Mathematics

Shmulevich, Ilya; Dougherty, Edward R.

 
9780691117621: Genomic Signal Processing (Princeton Series in Applied Mathematics)

Synopsis

Genomic signal processing (GSP) can be defined as the analysis, processing, and use of genomic signals to gain biological knowledge, and the translation of that knowledge into systems-based applications that can be used to diagnose and treat genetic diseases. Situated at the crossroads of engineering, biology, mathematics, statistics, and computer science, GSP requires the development of both nonlinear dynamical models that adequately represent genomic regulation, and diagnostic and therapeutic tools based on these models. This book facilitates these developments by providing rigorous mathematical definitions and propositions for the main elements of GSP and by paying attention to the validity of models relative to the data. Ilya Shmulevich and Edward Dougherty cover real-world situations and explain their mathematical modeling in relation to systems biology and systems medicine.



Genomic Signal Processing makes a major contribution to computational biology, systems biology, and translational genomics by providing a self-contained explanation of the fundamental mathematical issues facing researchers in four areas: classification, clustering, network modeling, and network intervention.

The synopsis may refer to a different edition of this title.

About the Author

Ilya Shmulevich, an associate professor at the Institute for Systems Biology, is the coauthor of Microarray Quality Control and the coeditor of Computational and Statistical Approaches to Genomics. Edward R. Dougherty is professor of electrical and computer engineering and director of the Genomic Signal Processing Laboratory at Texas A&M University, and director of the Computational Biology Division at the Translational Genomics Research Institute. His thirteen previous books include Random Processes for Image and Signal Processing.

From the Back Cover

"There is a genuine need for this concise, informative, clearly written book. In systems biology, engineers, mathematicians, and computer scientists are collaborating increasingly with biologists and researchers in medicine. This book goes a long way toward narrowing the gap on this front, and it lays a rigorous foundation for a new discipline."--Olli Yli-Harja, Tampere University of Technology

Excerpt. © Reprinted by permission. All rights reserved.

Genomic Signal Processing

By Ilya Shmulevich and Edward R. Dougherty

Princeton University Press

Copyright © 2007 Princeton University Press
All rights reserved.

ISBN: 978-0-691-11762-1

Chapter One

Biological Foundations

No single agreed-upon definition seems to exist for the term bioinformatics, which has been used to mean a variety of things ranging in scope and focus. To cite but a few examples from textbooks, Lodish et al. (2000) state that "bioinformatics is the rapidly developing area of computer science devoted to collecting, organizing, and analyzing DNA and protein sequences." A more general and encompassing definition, given by Brown (2002), is that bioinformatics is "the use of computer methods in studies of genomes." More general still: "bioinformatics is the science of refining biological information into biological knowledge using computers" (Draghici, 2003). Kohane et al. (2003) observe that the "breadth of this commonly used definition of bioinformatics risks relegating it to the dustbin of labels too general to be useful" and advocate being more specific about the particular bioinformatics techniques employed.

While it is true that the field of bioinformatics has traditionally dealt primarily with biological data encoded in digital symbol sequences, such as nucleotide and amino acid sequences, in this book we will be mainly concerned with extracting information from gene expression measurements and genomic signals. By the latter we mean any measurable events, principally the production of messenger ribonucleic acid (RNA) and protein, that are carried out by the genome. The analysis, processing, and use of genomic signals for gaining biological knowledge and translating this knowledge into systems-based applications is called genomic signal processing.

In this chapter, our aim is to place this material into a proper biological context by providing the necessary background for some of the key concepts that we shall use. We cannot hope to comprehensively cover the topics of modern genetics, genomics, cell biology, and others, so we will confine ourselves to brief overviews of some of these topics. We particularly recommend the book by Alberts et al. (2002) for a more comprehensive coverage of these topics.

1.1 GENETICS

Broadly speaking, genetics is the study of genes. The latter can be studied from different perspectives and on a molecular, cellular, population, or evolutionary level. A gene is composed of deoxyribonucleic acid (DNA), which is a double helix consisting of two intertwined and complementary nucleotide chains. The entire set of DNA is the genome of the organism. The DNA molecules in the genome are assembled into chromosomes, and genes are the functional regions of DNA.

Each gene encodes information about the structure and functionality of some protein produced in the cell. Proteins in turn are the machinery of the cell and the major determinants of its properties. Proteins can carry out a number of tasks, such as catalyzing reactions, transporting oxygen, regulating the production of other proteins, and many others. The way proteins are encoded by genes involves two major steps: transcription and translation. Transcription refers to the process of copying the information encoded in the DNA into a molecule called messenger RNA (mRNA). Many copies of the same RNA can be produced from only a single copy of DNA, which ultimately allows the cell to make large amounts of proteins. This occurs by means of the process referred to as translation, which converts mRNA into chains of linked amino acids called polypeptides. Polypeptides can combine with other polypeptides or act on their own to form the actual proteins. The flow of information from DNA to RNA to protein is known as the central dogma of molecular biology. Although it is mostly correct, there are a number of modifications that need to be made. These include the processes of reverse transcription, RNA editing, and RNA replication.
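The two steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not something taken from the book: the function names are hypothetical, transcription is modeled as rewriting the coding strand in the RNA alphabet (T becomes U), and translation uses the standard genetic code, which the excerpt itself does not list. Real gene expression involves regulation, splicing, and other processing that is ignored here.

```python
from itertools import product

# Standard genetic code, written compactly (assumption: not given in the
# excerpt). Codons are enumerated with bases in the order U, C, A, G, and
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (5'->3'): same sequence, T -> U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA into a polypeptide in one-letter amino acid codes.

    Reading starts at the first AUG and stops at the first stop codon
    or when fewer than three bases remain.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTAA")
    print(mrna)             # mRNA in the RNA alphabet
    print(translate(mrna))  # polypeptide as one-letter codes
```

Treating the sequence as a plain string keeps the sketch short; a dedicated library would be the more robust choice for real data.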

Briefly, reverse transcription refers to the conversion of a single-stranded RNA molecule to a double-stranded DNA molecule with the help of an enzyme aptly called reverse transcriptase. For example, the human immunodeficiency virus (HIV) has an RNA genome that is converted to DNA and inserted into the genome of the host. RNA editing refers to the alteration of RNA after it has been transcribed from DNA. Therefore, the ultimate protein product that results from the edited RNA molecule does not correspond to what was originally encoded in the DNA. Finally, RNA replication is a process whereby RNA can be copied into RNA without the use of DNA. Several viruses, such as hepatitis C virus, employ this mechanism. We will now discuss some preliminary concepts in more detail.

1.1.1 Nucleic Acid Structure

Almost every cell in an organism contains the same DNA content. Every time a cell divides, this material is faithfully replicated. The information stored in the DNA is used to code for the expressed proteins by means of transcription and translation. The DNA molecule is a polymer that is strung together from monomers called deoxyribonucleotides, or simply nucleotides, each of which consists of three chemical components: a sugar (deoxyribose), a phosphate group, and a nitrogenous base. There are four possible bases: adenine, guanine, cytosine, and thymine, often abbreviated as A, G, C, and T, respectively. Adenine and guanine are purines and have bicyclic structures (two fused rings), whereas cytosine and thymine are pyrimidines, and have monocyclic structures. The sugar has five carbon atoms that are typically numbered from 1' to 5'. The phosphate group is attached to the 5'-carbon atom, whereas the base is attached to the 1' carbon. The 3' carbon also has a hydroxyl group (OH) attached to it.
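The grouping of the four DNA bases into purines and pyrimidines can be made concrete with a small lookup. The sketch below is illustrative only; the names `base_class` and `composition` are hypothetical and do not come from the book.

```python
from collections import Counter

# Base classes as described in the text: purines have two fused rings,
# pyrimidines have a single ring.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def base_class(base: str) -> str:
    """Return 'purine' or 'pyrimidine' for a single DNA base letter."""
    base = base.upper()
    if base in PURINES:
        return "purine"
    if base in PYRIMIDINES:
        return "pyrimidine"
    raise ValueError(f"not a DNA base: {base!r}")


def composition(sequence: str) -> Counter:
    """Count how many purines and pyrimidines a DNA sequence contains."""
    return Counter(base_class(base) for base in sequence)


if __name__ == "__main__":
    print(base_class("A"))
    print(composition("ATCGGCTC"))
```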

Figure 1.1 illustrates the structure of a nucleotide with a thymine base. Although this figure shows one phosphate group, up to three phosphates can be attached. For example, adenosine 5'-triphosphate (ATP), which has three phosphates, is the molecule responsible for supplying energy for many biochemical cellular processes.

Ribonucleic acid is a polymer that is quite close in structure to DNA. One of the differences is that in RNA the sugar is ribose rather than deoxyribose. While the latter has a hydrogen at the 2' position (figure 1.1), ribose has a hydroxyl group at this position. Another difference is that the thymine base is replaced by the structurally similar uracil (U) base in a ribonucleotide.

The deoxyribonucleotides in DNA and the ribonucleotides in RNA are joined by the covalent linkage of a phosphate group, where one bond is between the phosphate and the 5' carbon of one sugar and the other bond is between the phosphate and the 3' carbon of the adjacent sugar (deoxyribose in DNA, ribose in RNA). This type of linkage is called a phosphodiester bond. The arrangement just described gives the molecule a 5'->3' polarity or directionality. Because of this, it is a convention to write the sequences of nucleotides starting with the 5' end at the left, for example, 5'-ATCGGCTC-3'. Figure 1.2 is a simplified diagram of the phosphodiester bonds and the covalent structure of a DNA strand.
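The writing convention and the linkage count can be illustrated with a short Python sketch. It assumes a linear, single-stranded molecule, so a strand of n nucleotides has n - 1 phosphodiester bonds; the helper names are hypothetical and not from the book.

```python
import re

# Matches the conventional notation, e.g. 5'-ATCGGCTC-3' (DNA or RNA letters).
_STRAND = re.compile(r"^5'-([ACGTU]+)-3'$")


def parse_strand(text: str) -> str:
    """Extract the base sequence from a string written as 5'-...-3'."""
    match = _STRAND.match(text.strip())
    if match is None:
        raise ValueError(f"not in 5'-...-3' notation: {text!r}")
    return match.group(1)


def phosphodiester_bonds(sequence: str) -> int:
    """Bonds joining adjacent nucleotides in a linear strand (n - 1)."""
    return max(len(sequence) - 1, 0)


def as_three_to_five(sequence: str) -> str:
    """Write the same strand with its 3' end at the left."""
    return "3'-" + sequence[::-1] + "-5'"


if __name__ == "__main__":
    seq = parse_strand("5'-ATCGGCTC-3'")
    print(seq, phosphodiester_bonds(seq))
    print(as_three_to_five(seq))
```

Reversing the string changes only the direction in which the strand is written; it is the same molecule, not its complementary strand.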

DNA commonly occurs in nature as two strands of nucleotides twisted around in a double helix, with the repeating phosphate-deoxyribose sugar polymer serving as the backbone. This backbone is on the outside of the helix, and the bases are located in the center. The opposite strands are joined by hydrogen bonding between the bases, forming base pairs. The two backbones are in opposite or antiparallel...

"About this title" may refer to a different edition of this title.