Workshops for the Bio-Scientist
Due to the rapid development of molecular biology techniques,
biological research has undergone profound changes over the past
years. Sequencing as well as gene and protein analysis are almost
fully automated and performed on a high-throughput basis, generating
large sets of biological data. The need for computational analysis of
such complex data sets is apparent. Understanding the uses and
limitations of the analytical tools now available is critical for
every scientist who uses bioinformatics tools for data analysis. The
BioTools.info workshops for the Bio-Scientist focus on the resources
available at the National Center for Biotechnology Information (NCBI).
However, many other online databases and tools will also be introduced
and discussed.
- Participants should be familiar with basic molecular and genetic
concepts and terminology, and with using a web browser.
Bio-scientific Databases: Scope & Search Strategies
The data sets that accumulate from sequencing projects grow
exponentially owing to high-throughput methods. GenBank, the
repository of the National Center for Biotechnology Information (NCBI)
for publicly available nucleotide sequences and their protein
translations, contained over 85.5 million sequence entries as of
April 2008. With databases of this size it is difficult to filter out
the desired information unless the user formulates the retrieval query
in exact terms. This workshop covers search strategies, including the
details of the Entrez search engine, as well as resources that can
serve as starting points for questions pertaining to molecular biology
and genomics databases and software. The following topics are the
center of attention:
- Indexing of data and choosing the appropriate database for a
research question
- Detailed discussion of the databases GenBank and RefSeq
- Detailed discussion of Entrez search functions for integrated
information retrieval
- Search strategies, including narrowing and broadening your search
- Finding search terms and using the correct gene names
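As a rough illustration of narrowing and broadening a search, the
sketch below assembles a fielded Entrez query and the matching
E-utilities esearch URL. The helper names (fielded, build_query,
esearch_url), the gene symbols and the excluded title word are
assumptions chosen for this example and are not part of the workshop
material. No request is sent; the code only builds strings.

```python
from urllib.parse import urlencode

# Public E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def fielded(term, field):
    """Attach an Entrez field tag; quote multi-word terms as a phrase."""
    if " " in term:
        return f'"{term}"[{field}]'
    return f"{term}[{field}]"


def build_query(gene_names, organism, exclude_title_words=()):
    """Broaden with OR over gene names, narrow with AND / NOT."""
    genes = " OR ".join(fielded(g, "Gene Name") for g in gene_names)
    query = f"({genes}) AND {fielded(organism, 'Organism')}"
    for word in exclude_title_words:
        query += f" NOT {fielded(word, 'Title')}"
    return query


def esearch_url(db, query, retmax=20):
    """Build (but do not send) an esearch request URL."""
    params = {"db": db, "term": query, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


# Hypothetical example: two symbols for one gene, restricted to human.
query = build_query(["BRCA1", "RNF53"], "Homo sapiens",
                    exclude_title_words=["predicted"])
print(query)
print(esearch_url("nucleotide", query))
```

Listing several names for the same gene inside the OR group broadens
the search, while the organism restriction and the NOT clause narrow
it again.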
Similarity Searching using BLAST
The Basic Local Alignment Search Tool (BLAST) family of programs
provides a powerful way to compare a query sequence against a sequence
database. Similarity searching can help to reveal the putative
identity and function of a molecular sequence. This hands-on workshop
introduces the terms and algorithms needed to perform sequence
similarity searching using BLAST. The following topics are the center
of attention:
- Theoretical background of the BLAST algorithm
- Choosing the appropriate search program
- Choosing the appropriate search parameters to
optimise a sequence similarity search
- Formatting and analysis of the results
- Application of PSI-BLAST searches
- Applications of RPS-BLAST, BLAST2Seq, Genomic BLAST and VecScreen
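The seed-and-extend idea behind BLAST can be sketched in a few lines.
The code below is a deliberately simplified illustration, not the
NCBI implementation: it uses exact word matches only, extends hits
without gaps, and computes no statistics such as E-values. The
function names, the word size and the scoring values (match,
mismatch, xdrop) are assumptions made for this example.

```python
from collections import defaultdict


def index_words(seq, w):
    """Map every word of length w to its start positions in seq."""
    index = defaultdict(list)
    for i in range(len(seq) - w + 1):
        index[seq[i:i + w]].append(i)
    return index


def extend(query, subject, qi, si, w, match=1, mismatch=-3, xdrop=5):
    """Extend an exact seed in both directions without gaps.

    Extension in one direction stops once the running score has
    fallen more than xdrop below the best score seen so far.
    Returns (score, query_start, subject_start, length).
    """
    def walk(q, s, step):
        best = score = best_len = n = 0
        while 0 <= q < len(query) and 0 <= s < len(subject):
            score += match if query[q] == subject[s] else mismatch
            n += 1
            if score > best:
                best, best_len = score, n
            elif best - score > xdrop:
                break
            q += step
            s += step
        return best, best_len

    right_score, right_len = walk(qi + w, si + w, 1)
    left_score, left_len = walk(qi - 1, si - 1, -1)
    score = w * match + left_score + right_score
    return score, qi - left_len, si - left_len, left_len + w + right_len


def seed_and_extend(query, subject, w=4):
    """Find exact word hits and report the distinct extended segments."""
    index = index_words(subject, w)
    segments = set()
    for qi in range(len(query) - w + 1):
        for si in index.get(query[qi:qi + w], ()):
            segments.add(extend(query, subject, qi, si, w))
    return sorted(segments, reverse=True)


for segment in seed_and_extend("GATTACA", "CCGATTACACC"):
    print(segment)
```

A shorter word size finds more seeds at the cost of more extensions to
evaluate, which is one reason the choice of search parameters matters.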
Genome Maps & Genome Browser (focus: NCBI Map Viewer)
A major challenge of genome projects lies not only in organizing the
quantity of data, but also in analysing and interpreting it. With the
help of genome browsers these data can be displayed as annotated
diagrams of the chromosomes of eukaryotic organisms. The data shown
are of different kinds and origins. This workshop covers resources for
viewing genomic data on maps of eukaryotic genomes. The following
topics are the center of attention:
- Theoretical background of the different maps that are available
(e.g. sequence maps, cytogenetic maps, radiation hybrid maps) and the
choice of the appropriate map for a research question
- Detailed discussion of NCBI's Map Viewer, a tool for the search,
display and manipulation of chromosome maps
- The use of the Map Viewer for the localization of genes, markers and
DNA polymorphisms, and for gene structure and sequence analysis
- Applications such as the analysis of syntenies, download options and
link-outs
Finding & Analysing DNA Polymorphisms
Differences in the genotypes of individuals account for differences in
their phenotypes. Today genetic variations are identified using modern
methods of molecular biology and deposited in databases. This workshop
covers resources, search strategies and the analysis of genetic
variations. The following topics are the center of attention:
- Different kinds of polymorphisms
- Introduction to the techniques to localize DNA
polymorphisms
- Using Entrez to search the databases OMIM, PopSet and dbSNP
- Evaluating polymorphisms
- The SNP Consortium, the resources of the CGAP (Cancer Genome Anatomy
Project) and the GAI (Genetic Annotation Initiative)
Protein Sequence Search & Protein Analysis Tools
In mammalian genomes, approx. 30,000 genes code for a still unknown
number of proteins, the proteome. Many genomes of microorganisms and
an increasing number of eukaryotic genomes have been completed, so
that genomes and proteomes can now be compared with one another. Such
comparisons can reveal, for example, bacterial metabolic pathways that
are switched on only under special conditions, and can support the
identification of new drug targets. This workshop covers protein
sequence databases, sequence retrieval and protein/proteome analysis
tools. The following topics could be the center of attention:
- Introduction to UniProt (Swiss-Prot/TrEMBL), UniRef and UniParc, the
databases of the UniProt Consortium for universal protein information
(new: January 2004)
- Applications of the databases PROSITE and ENZYME
- Introduction to the Sequence Retrieval System (SRS)
- Proteome comparison between different organisms with applications of
the NCBI resources TaxMap, the "Clusters of Orthologous Groups"
databases COGs and KOGs, and HomoloGene
- Resources at the Expert Protein Analysis System (ExPASy) server
Viewing Molecular Structures with 3D-Software (focus: Cn3D)
Protein function is based on the three-dimensional structure of a
protein. The three-dimensional structure results from the primary
structure of the protein and also depends on other factors, e.g. bound
molecules or the local environment. With the 3D structure at hand,
both protein interactions and protein functions can be studied. In
addition, mutations can be visualized, which helps to interpret their
effects on the protein. This workshop focuses on handling 3D software
to view molecular structures. The following topics are the center of
attention:
- Databases and database searches for molecular structures (Molecular
Modeling Database (MMDB), Protein Data Bank (PDB))
- Concepts for viewing molecular structures
- Application of Cn3D, the 3D viewer of the NCBI, for viewing and
manipulating molecular structures
- Distinguishing sequence alignments from structure alignments, and
interpreting the alignments
DNA-Tools: Restriction-Enzyme-Analysis, Primer-Design, Multiple Alignments
This workshop introduces sequence analysis tools that are freely
available on the WWW. They help to plan laboratory experiments
purposefully and to check the results of those experiments. The
following topics are the center of attention:
- DNA-Restriction-Enzyme-Analysis in silico: Which enzyme cuts where
and how often, which enzymes do not cut in the sequence?
- PCR primers: primer dimer formation, secondary structures in primer,
priming in template DNA, PCR conditions
- Multiple alignments with the help of Clustal W and what to do with
them
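The restriction analysis question (which enzyme cuts where and how
often, and which enzymes do not cut) can be sketched as follows. This
is a minimal example for a linear molecule, not a replacement for the
web tools covered in the workshop. The three enzymes and their
recognition sites are standard, but the helper names and the test
sequence are assumptions made for this illustration.

```python
import re

# Recognition site and cut offset within the site (top strand).
# EcoRI: G^AATTC, BamHI: G^GATCC, HindIII: A^AGCTT
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
    "HindIII": ("AAGCTT", 1),
}


def cut_positions(seq, site, offset):
    """Return 0-based positions after which the top strand is cut."""
    # A lookahead pattern also reports overlapping occurrences.
    pattern = re.compile(f"(?={site})")
    return [m.start() + offset for m in pattern.finditer(seq.upper())]


def digest_report(seq, enzymes=ENZYMES):
    """Split the enzymes into cutters (with positions) and non-cutters."""
    cutters, non_cutters = {}, []
    for name, (site, offset) in enzymes.items():
        cuts = cut_positions(seq, site, offset)
        if cuts:
            cutters[name] = cuts
        else:
            non_cutters.append(name)
    return cutters, non_cutters


def fragment_lengths(seq_len, cuts):
    """Fragment sizes of a linear molecule cut at the given positions."""
    bounds = [0] + sorted(cuts) + [seq_len]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Made-up test sequence containing two EcoRI sites and one BamHI site.
sequence = "AAGAATTCTTTTGGATCCAAGAATTCGG"
cutters, non_cutters = digest_report(sequence)
for name, cuts in cutters.items():
    print(name, "cuts", len(cuts), "time(s) at", cuts,
          "fragments:", fragment_lengths(len(sequence), cuts))
print("Non-cutters:", non_cutters)
```

Because all three sites are palindromic, searching one strand is
sufficient here; enzymes with non-palindromic sites would also need a
search on the reverse complement.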
Resources for the Analysis of Gene Expression
Cells differ because their gene expression patterns differ. Which
genes are expressed, and at what level, drives cell differentiation
and the development of an organism. This workshop introduces resources
for the search and analysis of gene expression data. The following
topics are the center of attention:
- Introduction to the methods to obtain gene expression data
- The array experiment and the validation of the experiment by
independent methods
- Databases and resources for the location and analysis of gene
expression data
- NCBI's Gene Expression Omnibus (GEO)
- SAGE Genie (CGAP)
- SAGEmap (NCBI)
- Introduction to the evaluation of gene expression data
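One common first step when evaluating expression data is to compare
two conditions as a log2 fold change. The sketch below is only an
illustration of that idea under simple assumptions: the function
names, the pseudocount, the threshold and the gene names and values
are all invented for this example and do not come from any data set.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of mean expression values.

    The pseudocount (an assumption of this sketch) avoids division
    by zero when a gene is not detected in one condition.
    """
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return math.log2((mean_treated + pseudocount) /
                     (mean_control + pseudocount))


def classify(lfc, threshold=1.0):
    """Label a gene by an arbitrary two-fold change threshold."""
    if lfc >= threshold:
        return "up"
    if lfc <= -threshold:
        return "down"
    return "unchanged"


# Invented replicate values for three hypothetical genes.
expression = {
    "geneA": ([7.0, 7.0], [1.0, 1.0]),
    "geneB": ([3.0, 3.0], [3.0, 3.0]),
    "geneC": ([1.0, 1.0], [7.0, 7.0]),
}
for gene, (treated, control) in expression.items():
    lfc = log2_fold_change(treated, control)
    print(gene, round(lfc, 2), classify(lfc))
```

A fold change alone says nothing about statistical significance, which
is one reason array results are validated by independent methods.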
Clinical Genetics Resources
Many Web tools deal with genetic disorders and their molecular
background. This workshop covers searching general as well as
specialized resources to answer questions that arise for the clinical
geneticist. The following topics are the center of attention:
- NCBI's database Online Mendelian Inheritance in Man (OMIM)
- GeneTests·GeneClinics, a resource providing information on genetic
testing
- GeneCards, an encyclopedia of human genes with extensive link-outs
The specialised resources include
- NCI's Cancer Genome Anatomy Project (CGAP)
- NCBI's Spectral Karyotyping (SKY) and Comparative Genomic
Hybridization (CGH) database for cancer aberrations
- Jablonski's database of Multiple Congenital Anomaly/Mental
Retardation (MCA/MR) Syndromes
- Orphanet, the database of rare diseases
- Further resources, such as HuGENet and the Atlas of Genetics and
Cytogenetics, are also introduced.
Searching MEDLINE on PubMed
This workshop covers literature searching in PubMed, the free database
of medical literature from the National Library of Medicine (NLM). The
following topics are the center of attention:
- The origins and the scope of PubMed
- Search strategies, search parameters and
extended search parameters
- Searching in and with the help of Medical Subject Headings (MeSH)
- The Journal database
- Other points of discussion: PubMed Central and
the NLM Gateway
Searching for Patents in Open Sources
Given the unique structure of patents, searching for valuable
scientific information "buried" in patents can be quite challenging.
In addition to the commercial databases, several freely available
internet resources can also be used for patent searching. This
workshop explains document codes, such as dates and classifications,
and how these can be used for searches in open sources such as
esp@cenet (EPO) and DEPATISnet (DPMA).