ENRICHMENT OF OLIGONUCLEOTIDE SETS WITH TRANSCRIPTION CONTROL SIGNALS. 3. DNA FROM NON-MAMMALIAN VERTEBRATES

Scapoli, Chiara; Volinia, Stefano
1993

Abstract

We studied the frequency distribution of all 1,048,576 (4^10) possible 10-bp oligonucleotides (decamers) in a sample of 1.072 × 10^6 bases of genes from non-mammalian vertebrates, comprising 322 sequences extracted from EMBL(R) 29.0, with the aim of detecting transcription control signals. Of all decamers, 2097 (0.2%) occurred more than 10 times as frequently as the mean and were subjected to further statistical analysis. For each of these 2097 decamers (parents), we counted the individual frequencies of the 30 decamers that differ from the parent by a single base substitution (progeny), and we calculated two variance/mean chi-squares for the progeny, one with and one without the parent decamer. From the distribution of the ratio between the two chi-squares, we observed that 1017 of the 2097 decamers had a chi-square ratio between 1 and 1.5. In this final set, which corresponds to < 0.097% of all possible decamers, 75 decamers were found to contain 100 transcription control elements, such as CCAAT. The final set shows a large excess of signals when compared with 100 random sets of 1017 decamers. Some of the decamers selected by this procedure are members of consensus sequences rather than unique sequences.
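
The selection procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, the variance/mean chi-square is assumed to be the index of dispersion, sum((x - mean)^2) / mean, and the ratio is assumed to be the chi-square with the parent divided by the chi-square without it; the abstract does not state either detail.

```python
from collections import Counter

BASES = "ACGT"


def count_kmers(sequences, k=10):
    """Count every overlapping k-mer that contains only A, C, G, T."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set(BASES):
                counts[kmer] += 1
    return counts


def frequent_kmers(counts, k=10, fold=10):
    """Return k-mers whose count exceeds `fold` times the mean count.

    The mean is taken over all 4**k possible k-mers, including absent ones.
    """
    mean = sum(counts.values()) / 4 ** k
    return [kmer for kmer, c in counts.items() if c > fold * mean]


def progeny(parent):
    """Return the 3 * len(parent) k-mers differing from parent at one base."""
    return [parent[:i] + b + parent[i + 1:]
            for i in range(len(parent))
            for b in BASES if b != parent[i]]


def dispersion_chi_square(values):
    """Variance/mean chi-square (assumed form): sum((x - mean)^2) / mean."""
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0
    return sum((x - mean) ** 2 for x in values) / mean


def chi_square_ratio(parent, counts):
    """Chi-square with the parent over chi-square without it (assumed direction)."""
    prog = [counts.get(p, 0) for p in progeny(parent)]
    without_parent = dispersion_chi_square(prog)
    with_parent = dispersion_chi_square(prog + [counts.get(parent, 0)])
    if without_parent == 0:
        return float("inf")
    return with_parent / without_parent
```

Under these assumptions, the final set would be the decamers returned by `frequent_kmers` whose `chi_square_ratio` lies between 1 and 1.5. Whether the interval bounds are inclusive is not stated in the abstract.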
Scapoli, Chiara; Rodriguezlarralde, A; Volinia, Stefano; Barrai, I.

Use this identifier to cite or link to this document: https://hdl.handle.net/11392/463103