For each cluster, the following information is provided for each motif that is a member of the cluster.

	purM-1 	the name of the motif; it indicates that the motif was
		found upstream of the E. coli gene purM; the "-1" is a numeric
		suffix that distinguishes this motif from other motifs for the
		same gene (e.g., purM-2).
	(upp)	the name of the divergently transcribed gene (purM and upp
		are divergently transcribed in E. coli); this information is
		provided if, and only if, the motif (purM-1) was also detected
		when the orthologous promoter regions of both genes were
		examined independently during phylogenetic footprinting; in
		that case, only one of the motifs was included in clustering
		to avoid duplication.
	480/500	this ratio indicates the number of times (480) that a
		motif was sampled into this cluster, from the total number of
		iterations (500) during which that particular cluster was active;
		motifs are shown in clusters if they were members for 
		greater than 25% of the iterations during which the cluster 
		was active; clusters active for less than half the total 
		number of iterations performed during sampling are not shown
		(in these results, clusters active for 250 or fewer iterations 
		are not shown).
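
The display rules above can be sketched in a few lines of Python. This is only an illustration, not the code used to produce these results: the function names are hypothetical, and the cluster cutoff follows the parenthetical rule for these results (active for more than 250 iterations).

```python
def parse_membership(ratio):
    """Split a string such as '480/500' into (times_sampled, active_iterations)."""
    sampled_text, active_text = ratio.split("/")
    sampled, active = int(sampled_text), int(active_text)
    if active <= 0 or not 0 <= sampled <= active:
        raise ValueError(f"invalid membership ratio: {ratio!r}")
    return sampled, active


def motif_is_shown(sampled, active, min_fraction=0.25):
    # A motif is listed only if it was a member for MORE than 25% of the
    # iterations during which the cluster was active.
    return sampled / active > min_fraction


def cluster_is_shown(active, cutoff=250):
    # Assumption: follows the rule stated for these results
    # (clusters active for 250 or fewer iterations are not shown).
    return active > cutoff


# Example using the ratio from the text above.
sampled, active = parse_membership("480/500")
fraction = sampled / active  # proportion of active iterations spent in the cluster
```

The fraction computed here is the same proportion that weights a motif's counts in the cluster models described below.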

Gene names

The gene names are those used in the GenBank entry of the E. coli K12 genome (U00096, 17-MAY-1999).

Cluster models

The matrices contain proportional (floating-point) counts. Each row is a successive position in the matrix, and the columns are the nucleotides in the order A, T, C, G. The matrix is constructed by summing the counts from each motif in the cluster (at each position, for each nucleotide) in proportion to the amount of time the motif spent in the cluster. For example, if a motif's proportion is 80%, then 80% of its counts at each position are used in the summation.
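
The summation described above can be illustrated with a minimal Python sketch. It is not the original implementation: the function name, the input layout (a list of rows of A, T, C, G counts per motif), and the example counts are all hypothetical, and the motifs are assumed to be aligned and of equal width.

```python
NUCLEOTIDES = ("A", "T", "C", "G")  # column order stated in the text


def build_cluster_matrix(motifs):
    """Sum motif count matrices, each scaled by its membership proportion.

    `motifs` is a list of (counts, proportion) pairs, where `counts` is a
    list of rows (one per position) of four counts in A, T, C, G order.
    All motifs are assumed to be aligned and of equal width.
    """
    if not motifs:
        return []
    width = len(motifs[0][0])
    matrix = [[0.0] * len(NUCLEOTIDES) for _ in range(width)]
    for counts, proportion in motifs:
        if len(counts) != width:
            raise ValueError("all motifs must have the same number of positions")
        for pos, row in enumerate(counts):
            for col, count in enumerate(row):
                # Only `proportion` of each count contributes to the sum.
                matrix[pos][col] += proportion * count
    return matrix


# Hypothetical two-position motifs with made-up counts; the 0.8 proportion
# mirrors the 80% example in the text.
motif_a = ([[10, 0, 0, 0], [0, 5, 5, 0]], 0.8)
motif_b = ([[0, 10, 0, 0], [2, 2, 2, 4]], 0.5)
cluster_matrix = build_cluster_matrix([motif_a, motif_b])
```

In this sketch, a motif with proportion 0.8 and 10 counts of A at the first position contributes 8 to the A column of the first row.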