RSAT Matrix Scan

What it does

Scans one or more DNA sequences with a profile matrix (position-specific scoring matrix) and reports the positions of the matches it finds.
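
The idea behind matrix scanning can be illustrated with a minimal Python sketch. It is not the RSAT implementation: the function names (`log_odds_matrix`, `scan`), the equiprobable background, the pseudo-count of 1 and the score cutoff are all assumptions made for illustration. The counts are those of the example matrix m1 on this page. Each window of the sequence is scored on the direct (D) and reverse (R) strands as a sum of log-odds weights.

```python
import math

# Counts of the example matrix m1 (one dict per matrix column).
COUNTS = [
    {"A": 4, "C": 16, "G": 0, "T": 0},
    {"A": 19, "C": 0, "G": 1, "T": 0},
    {"A": 0, "C": 20, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 20, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 20},
    {"A": 0, "C": 0, "G": 20, "T": 0},
]


def log_odds_matrix(counts, background=None, pseudo=1.0):
    """Convert counts to natural-log odds weights.

    Assumption: equiprobable background and a pseudo-count of 1,
    distributed according to the background frequencies.
    """
    background = background or {b: 0.25 for b in "ACGT"}
    weights = []
    for col in counts:
        total = sum(col.values())
        weights.append({
            b: math.log(
                ((col[b] + pseudo * background[b]) / (total + pseudo))
                / background[b]
            )
            for b in "ACGT"
        })
    return weights


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan(seq, weights, min_score):
    """Return (strand, start, end, word, score) for windows scoring >= min_score.

    Coordinates are 1-based and inclusive.
    """
    seq = seq.upper()
    width = len(weights)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(b not in "ACGT" for b in window):
            continue  # skip windows containing N or other symbols
        for strand, word in (("D", window), ("R", reverse_complement(window))):
            score = sum(w[b] for w, b in zip(weights, word))
            if score >= min_score:
                hits.append((strand, start + 1, start + width, word, round(score, 3)))
    return hits


weights = log_odds_matrix(COUNTS)
# Hypothetical toy sequence and cutoff, chosen only for the demonstration.
hits = scan("ttgaCACGTGaacc", weights, min_score=7.0)
for hit in hits:
    print(*hit, sep="\t")
```

Because CACGTG is its own reverse complement, the sketch reports the same position on both strands. The weights reported by the tool depend on its background model and pseudo-count settings, so the scores of this sketch are not expected to match the tool's output.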


Example with a TRANSFAC-format matrix and a FASTA file:

AC  m1
XX
ID  m1
XX
DE  m1 m1; from JASPAR
P0       A     C     G     T
1        4    16     0     0
2       19     0     1     0
3        0    20     0     0
4        0     0    20     0
5        0     0     0    20
6        0     0    20     0
XX
CC  program: jaspar
CC  matrix.nb: 1
CC  min.prior: 0.25
CC  alphabet.size: 4
CC  max.bits: 2
CC  total.information: 6.56407409450406
CC  information.per.column: 1.09401234908401
CC  max.possible.info.per.col: 1.38629436111989
CC  consensus.strict: CACGTG
CC  consensus.strict.rc: CACGTG
CC  consensus.IUPAC: CACGTG
CC  consensus.IUPAC.rc: CACGTG
CC  consensus.regexp: CACGTG
CC  consensus.regexp.rc: CACGTG
CC  residues.content.crude.freq: a:0.1917|c:0.3000|g:0.3417|t:0.1667
CC  G+C.content.crude.freq: 0.641666666666667
CC  residues.content.corrected.freq: a:0.1944|c:0.2976|g:0.3373|t:0.1706
CC  G+C.content.corrected.freq: 0.634920634920635
XX
//
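
To show how the matrix above is laid out, here is a small Python sketch that reads the count rows following the `P0` header and derives the most frequent residue per column. The function name `parse_transfac_counts` is hypothetical and the parser handles only the simple layout shown here, not the full TRANSFAC format.

```python
# Abridged copy of the example matrix (CC comment lines omitted).
TRANSFAC = """\
AC  m1
XX
ID  m1
XX
DE  m1 m1; from JASPAR
P0       A     C     G     T
1        4    16     0     0
2       19     0     1     0
3        0    20     0     0
4        0     0    20     0
5        0     0     0    20
6        0     0    20     0
XX
//
"""


def parse_transfac_counts(text):
    """Return one {residue: count} dict per matrix column.

    Assumption: counts start after a 'P0' (or 'PO') header line and
    end at the next 'XX' or '//' line.
    """
    header, rows = None, []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in ("P0", "PO"):
            header = fields[1:]
        elif header and fields[0].isdigit():
            counts = [int(x) for x in fields[1:1 + len(header)]]
            rows.append(dict(zip(header, counts)))
        elif header and fields[0] in ("XX", "//"):
            break
    return rows


rows = parse_transfac_counts(TRANSFAC)
consensus = "".join(max(row, key=row.get) for row in rows)
print(consensus)  # CACGTG, as in the consensus.strict comment line
```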

>mm9_chr3_121848111_121848740_+
tggggtgggttccaggacagccaggactgtcacacaaagaaaccttgtctcaaaaaaaca
aaaCAAAAAACAAAACAAAACCAAGCAAGCAAGCAAACATGGGCTTAAATctggatacag
tggcctttatttctagttccagtgactgggagactgaaacaagagagtcacttgagtaca
ggagtgcaaggctagcttgagcaatatagtaagactatctcaaaaTGTGAATTtagatca
acagaattgacatcaagaaaaatactgatatcactcaaagcaatctacagattcaacaca
atctccatcaacatgacaatgacttccatcaGCATGACAATGACTCCATCAACATGCCAA
TGGGCCCCATCAACATAACAATGACCCCTATCATCATGACAATGATCCCCATCAACATGA
CAATGACCTCCATCAACATGACAATTACTCCTGTCAACATGCCAATtgttggggttcaga
agtcaccctgcaaaccacaagaacactaatctcagtcaagcagggatggtttactgaacg
tatatccaaagactgagtgaccaagggaacagctcagactctagagctgaaagctagctg
tgcgctggacatttctcggggccaactta
>mm9_chr14_86795691_86796311_+
CTCAAGGAGGATCCAGAAGTTGGCAGTTTCTGAGGCGAGTCCCATATTCCTCCCCTAAGG
GGTCAGGATTTTTCAGGTCTGGGCTCTTCTTGTTCTTTTGACAGCGACATTAATAATTGT
ACCAGCTCTCCCCTGGCAGGGCCGCACCACAGAGTAAAGCCTGGAGTAGGAGCTGTGCCC
AGCGCAATAATACCAGTTAAATAAGTACGTTCATTACCTCCCAACAGTCAAGGAGTTTAA
AATCCGTCAATTCACCCCACATGAGGGAGATTATGTGATTTACATGTTAAAGTGCCCCTG
TGGTTTGATTTGCATAGCAAAGACTTTGGGGGCACAGAAACAAAGCATCCGCATGCATGA
CAAGAGGACTATTAGCATAGAGAGAGCAGGTTTTCCGACAGCCCAGCCTGGCAAACAATG
CTGCACCTTCGCTGCTCGCTGGAGTTTATAGGATTTGACAGTTTCTCATCTGAGGGAGGA
GAAGGTTAGGGAGTTGGGGTGGAATGAGGTTCGCTAGATTGGCTGTCTGCCTTAAGCACA
ATAATTTGTCTTTCTCTGAAGACCTCCCCTCCTCTTATCACCTTTATCGTTTCTTTCTGA
TGTTCATTCAAAGACGTCTT

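Results when returning sites and P-values (sites + pval) with a P-value threshold of 0.0001. The output lists more sequences than the two FASTA records shown above, which are only an excerpt of the input: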

#seq_id        ft_type ft_name strand  start   end     sequence        weight  Pval    ln_Pval sig
mm9_chr3_121848111_121848740_+ limit   START_END       D       1       629     .       0       0       0       0
mm9_chr14_86795691_86796311_+  limit   START_END       D       1       620     .       0       0       0       0
mm9_chr15_84452311_84453000_+  limit   START_END       D       1       689     .       0       0       0       0
mm9_chr1_134118761_134119120_+ limit   START_END       D       1       359     .       0       0       0       0
mm9_chr17_35640731_35641360_+  limit   START_END       D       1       629     .       0       0       0       0
mm9_chr18_66052791_66053190_+  limit   START_END       D       1       399     .       0       0       0       0
mm9_chr6_125165051_125165561_+ limit   START_END       D       1       510     .       0       0       0       0
mm9_chr16_35588051_35588520_+  limit   START_END       D       1       469     .       0       0       0       0
mm9_chr2_51927461_51927981_+   limit   START_END       D       1       520     .       0       0       0       0
mm9_chr2_51927461_51927981_+   site    m1      D       106     111     CACGTG  9.1     7.2e-05 -9.545  4.145
mm9_chr2_51927461_51927981_+   site    m1      R       106     111     CACGTG  9.1     7.2e-05 -9.545  4.145
mm9_chr3_18523921_18524271_+   limit   START_END       D       1       350     .       0       0       0       0
mm9_chr17_29324151_29324630_+  limit   START_END       D       1       479     .       0       0       0       0
mm9_chr5_111009921_111010390_+ limit   START_END       D       1       469     .       0       0       0       0
mm9_chr14_37727831_37728360_+  limit   START_END       D       1       529     .       0       0       0       0
mm9_chr11_77707171_77707751_+  limit   START_END       D       1       580     .       0       0       0       0
mm9_chr14_50067391_50067801_+  limit   START_END       D       1       410     .       0       0       0       0
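
The result table can be post-processed with a few lines of Python. The sketch below is an illustration under assumptions: the function name `parse_sites` is hypothetical, columns are assumed to be whitespace-separated, and the `limit` rows are taken to mark sequence boundaries. It keeps only the `site` rows and checks that, for these rows, `sig` is close to -log10 of `Pval` (small differences are expected because `Pval` is printed with two significant digits).

```python
import math

# A few rows copied from the result table above.
OUTPUT = """\
#seq_id ft_type ft_name strand start end sequence weight Pval ln_Pval sig
mm9_chr2_51927461_51927981_+ limit START_END D 1 520 . 0 0 0 0
mm9_chr2_51927461_51927981_+ site m1 D 106 111 CACGTG 9.1 7.2e-05 -9.545 4.145
mm9_chr2_51927461_51927981_+ site m1 R 106 111 CACGTG 9.1 7.2e-05 -9.545 4.145
mm9_chr3_18523921_18524271_+ limit START_END D 1 350 . 0 0 0 0
"""


def parse_sites(text):
    """Return the 'site' rows as dicts keyed by the header column names."""
    header, sites = None, []
    for line in text.splitlines():
        if line.startswith("#"):
            header = line.lstrip("#").split()
            continue
        fields = line.split()
        if header is None or len(fields) != len(header):
            continue  # skip blank or malformed lines
        row = dict(zip(header, fields))
        if row["ft_type"] == "site":
            sites.append(row)
    return sites


sites = parse_sites(OUTPUT)
for site in sites:
    recomputed = -math.log10(float(site["Pval"]))
    print(site["seq_id"], site["strand"], site["start"], site["end"],
          site["sequence"], site["sig"], round(recomputed, 3))
```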

Citation

For the underlying tool, please cite: Thomas-Chollier, M., Defrance, M., Medina-Rivera, A., Sand, O., Herrmann, C., Thieffry, D. and van Helden, J. (2011). RSAT 2011: regulatory sequence analysis tools. Nucleic Acids Research. doi:10.1093/nar/gkr377.