©1996-2007 All Rights Reserved. Online Journal of Bioinformatics.
OJB™
Online Journal of Bioinformatics© 8 (2): 139-153, 2007
Shape-to-String Mapping: A Novel Approach To Clustering Time-Index Biomics Data
Antoine W¹, Miernyk JA¹,²,³
¹Department of Biochemistry, ²USDA, Agricultural Research Service, Plant Genetics Research Unit and ³Interdisciplinary Plant Group,
ABSTRACT
Antoine W, Miernyk JA, Shape-to-String Mapping: A Novel Approach To Clustering Time-Index Biomics Data, Online Journal of Bioinformatics 8 (2):139-153, 2007. Herein we describe a qualitative approach for clustering time-index biomics data. The intensity ratios between adjacent time points are transformed into angles. A code then maps the numerical time-index data to a qualitative representation that captures the features in the data that define the shape of the expression pattern as a function of time.
The problem of clustering time-index biomics data is then either solved directly or reduced to a problem similar to the well-studied task of clustering protein sequence data. For datasets with few time points, the words derived from the transformation are adequate to define clusters. Dissimilarities between the newly defined objects can be estimated, and the resulting distance matrix can be used for further clustering. The results from transcript profiling of developing soybean embryos have been used to illustrate the utility of the method. Comparative mapping of the intensity ratios and the angles by multidimensional scaling and Procrustes analysis revealed otherwise cryptic information within the data. The Euclidean distance matrices were calculated from the words and corresponding gene list using the PHYLogeny Inference Package (PHYLIP) algorithms and the Point Accepted Mutation (PAM) score matrix to compare the effectiveness of the code in clustering the data.
Key words: String Map, Cluster, Biomics
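
The shape-to-string idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the angle formula, the code alphabet, or the bin boundaries, so everything below is an assumption made for illustration. Here the angle is taken as the arctangent of the log2 intensity ratio between adjacent time points, the alphabet is a hypothetical five-letter code (A = steep decrease, C = roughly flat, E = steep increase), and a plain Hamming distance stands in for the PAM-based scoring mentioned above.

```python
import math
from collections import defaultdict

# Hypothetical code alphabet: five equal bins of 36 degrees over (-90, 90).
# A = steep decrease, B = decrease, C = roughly flat, D = increase, E = steep increase.
ALPHABET = "ABCDE"
BIN_WIDTH = 180.0 / len(ALPHABET)


def ratios_to_angles(intensities):
    """Return one angle (degrees) per pair of adjacent time points.

    Assumption: angle = atan(log2(x[t+1] / x[t])); intensities must be positive.
    """
    return [
        math.degrees(math.atan(math.log2(later / earlier)))
        for earlier, later in zip(intensities, intensities[1:])
    ]


def angles_to_word(angles):
    """Map each angle to the letter of the bin it falls in."""
    letters = []
    for angle in angles:
        index = int((angle + 90.0) // BIN_WIDTH)
        # Clamp so that values at the extreme edges stay inside the alphabet.
        index = min(max(index, 0), len(ALPHABET) - 1)
        letters.append(ALPHABET[index])
    return "".join(letters)


def shape_word(intensities):
    """Convert a time-index intensity profile into its shape word."""
    return angles_to_word(ratios_to_angles(intensities))


def cluster_by_word(profiles):
    """Group profile names that share an identical shape word."""
    clusters = defaultdict(list)
    for name, intensities in profiles.items():
        clusters[shape_word(intensities)].append(name)
    return dict(clusters)


def hamming(word_a, word_b):
    """Count positions at which two equal-length words differ."""
    if len(word_a) != len(word_b):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(word_a, word_b))


# Made-up example profiles (not data from the study).
profiles = {
    "gene1": [1, 2, 4, 4],
    "gene2": [10, 20, 40, 40],
    "gene3": [8, 4, 4, 16],
}
clusters = cluster_by_word(profiles)
```

In this sketch, profiles with the same shape but different absolute intensities (gene1 and gene2) receive the same word and fall into one cluster, which corresponds to the direct solution for datasets with few time points. For longer series, the pairwise distances between words could be collected into a matrix and passed to any standard clustering routine.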