IDENTIFICATION OF A SET OF FREQUENT DECANUCLEOTIDES IN PLANTS AND IN ANIMALS

SCAPOLI, Chiara; VOLINIA, Stefano
1994

Abstract

We studied the frequency distribution of all 1,048,576 possible oligonucleotides 10 bp long (decamers) in a 1.961 Mbase sample of plant genes, comprising 635 sequences extracted from GenBank 71.0, with the aim of detecting transcription control signals. Among all decamers, 3255, or 0.3%, had a frequency more than 10 times the mean and were subjected to further statistical analysis. For each of these 3255 decamers (parents), we counted the individual frequencies of the 30 decamers (progeny) that differ from the parent by a single base substitution, and calculated two variance/mean chi-squares for the progeny, one with and one without the parent decamer. By studying the distribution of the ratio between the two chi-squares, we observed that, of the 3255 decamers more than 10 times as frequent as average, 432 had a chi-square ratio > 1.9. In this residual set, which corresponds to about 0.04 per cent of all possible decamers, only 15 known eukaryotic transcription control elements were found; on the other hand, it included 29 decanucleotides that matched decanucleotides in a set from Drosophila, 24 in a set from mammals, 13 in a set from yeast and four in a set from viruses, all of these sets having been identified with the statistical procedures described here. These decanucleotides are highly repetitive and seem to be present throughout all higher organisms, whereas they are uncommon in mammalian viruses.
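
The screening procedure summarized in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, decamers are counted in overlapping windows on one strand, the mean frequency is taken over all 4^k possible k-mers, the variance/mean chi-square is computed as the sum of squared deviations divided by the mean, and the ratio is taken as the chi-square with the parent over the chi-square without it. The abstract does not specify any of these details.

```python
from collections import Counter

BASES = "ACGT"

def count_kmers(sequences, k=10):
    """Count overlapping k-mers; windows containing symbols other than ACGT are skipped."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set(BASES):
                counts[kmer] += 1
    return counts

def frequent_kmers(counts, k=10, fold=10):
    """k-mers whose count exceeds `fold` times the mean over all 4**k possible k-mers (assumed definition of the mean)."""
    mean = sum(counts.values()) / 4 ** k
    return [kmer for kmer, c in counts.items() if c > fold * mean]

def progeny(parent):
    """All k-mers differing from the parent by one substitution (3 * k of them, 30 for a decamer)."""
    return [parent[:i] + b + parent[i + 1:]
            for i in range(len(parent))
            for b in BASES if b != parent[i]]

def dispersion_chi2(values):
    """Variance/mean chi-square: sum((x - mean)**2) / mean; returns 0.0 when the mean is zero."""
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0
    return sum((x - mean) ** 2 for x in values) / mean

def chi2_ratio(parent, counts):
    """Chi-square of progeny plus parent divided by chi-square of progeny alone (assumed direction of the ratio)."""
    prog = [counts.get(p, 0) for p in progeny(parent)]
    with_parent = dispersion_chi2(prog + [counts.get(parent, 0)])
    without_parent = dispersion_chi2(prog)
    if without_parent == 0:
        # Progeny counts are all equal: the ratio is undefined, reported here as inf or 0.0.
        return float("inf") if with_parent > 0 else 0.0
    return with_parent / without_parent

def screen(sequences, k=10, fold=10, min_ratio=1.9):
    """Return the frequent k-mers whose chi-square ratio exceeds `min_ratio`."""
    counts = count_kmers(sequences, k)
    return [kmer for kmer in frequent_kmers(counts, k, fold)
            if chi2_ratio(kmer, counts) > min_ratio]
```

Under these assumptions, calling `screen(sequences)` on a list of sequence strings applies the two thresholds quoted in the abstract (frequency more than 10 times the mean, chi-square ratio > 1.9). A high ratio means the parent stands out from its one-substitution neighbours rather than merely belonging to a uniformly frequent family.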
Scapoli, Chiara; Rodriguez-Larralde, A.; Volinia, Stefano; Beretta, M.; Barrai, I.
Use this identifier to cite or link to this document: https://hdl.handle.net/11392/463099