Hierarchical Motif Discovery
WARNING: This mode is still experimental and may end in an infinite loop. Changing the parameters, e.g., increasing -l or minClSize, usually eliminates the problem. This will be fixed in a later revision.

Splash can be used to exhaustively analyze a sequence database and produce a tree of sequence clusters based on the discovered patterns. Results are reported as a series of HTML files that can be browsed by opening the file InputFile_Browser.html. This file is compatible with IE 4.0 and higher only. It can be opened in Netscape Navigator, but in that case the tree is not dynamically updated by user clicks.

The approach runs the pattern discovery algorithm in a loop. First, pattern discovery is run using the standard discovery options. If no pattern is found, the density constraint is first reduced according to a user-defined formula, and then the minimum support is reduced by 5%, until a pattern is found or a user-defined minimum support is reached.

The most statistically significant pattern among all those found is used to generate a PSSM (position-specific scoring matrix), which is used to split the original sequence set into two sets, a "match set" and a "mismatch set". To do that, the PSSM is matched against every sequence in the set. Three outcomes are possible:

1) The sequence is a strong match. In that case it is added to the match set.
2) The sequence is a strong mismatch. In that case it is added to the mismatch set.
3) The sequence is neither a strong match nor a strong mismatch. In that case it is added to both sets.

The pattern is then masked wherever it occurs in the match set, and discovery is repeated recursively on both the match set and the mismatch set. This produces a hierarchical tree. The procedure stops when either a set has fewer than a user-defined minClSize sequences, or the minimum support drops below a user-defined threshold and no pattern has been found.

This approach is illustrated by the following block diagram:
[Figure: block diagram of the hierarchical motif discovery procedure]
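The recursive procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Splash's actual implementation: the names `Node`, `relax_and_discover`, `split`, `build_tree`, `toy_discover`, and `toy_score` are hypothetical, the pattern is treated as a literal substring rather than a PSSM, the 5% support reduction is assumed to be relative, the user-defined density relaxation is omitted, and the check that skips an unchanged mismatch set is an extra guard added here to keep the sketch from recursing forever.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    sequences: List[str]
    pattern: Optional[str] = None      # pattern used to split this node, if any
    match: Optional["Node"] = None
    mismatch: Optional["Node"] = None


def relax_and_discover(seqs, discover, support, min_support):
    """Retry discovery, lowering the support by 5% (assumed relative) each round."""
    while support >= min_support:
        pattern = discover(seqs, support)
        if pattern is not None:
            return pattern
        support *= 0.95
    return None


def split(seqs, score, hi, lo):
    """Three-way outcome: strong match, strong mismatch, or ambiguous (both sets)."""
    match, mismatch = [], []
    for s in seqs:
        x = score(s)
        if x >= hi:
            match.append(s)
        elif x <= lo:
            mismatch.append(s)
        else:
            match.append(s)
            mismatch.append(s)
    return match, mismatch


def build_tree(seqs, discover, score, support, min_support, min_cl_size, hi, lo):
    node = Node(seqs)
    if len(seqs) < min_cl_size:        # set too small: stop
        return node
    pattern = relax_and_discover(seqs, discover, support, min_support)
    if pattern is None:                # support floor reached without a pattern: stop
        return node
    node.pattern = pattern
    match, mismatch = split(seqs, lambda s: score(pattern, s), hi, lo)
    # Mask the pattern in the match set so it is not rediscovered.
    match = [s.replace(pattern, "X" * len(pattern)) for s in match]
    args = (discover, score, support, min_support, min_cl_size, hi, lo)
    node.match = build_tree(match, *args)
    # Guard added for this sketch: do not recurse on an unchanged mismatch set.
    if len(mismatch) < len(seqs):
        node.mismatch = build_tree(mismatch, *args)
    return node


def toy_discover(seqs, support, k=3):
    """Stand-in for pattern discovery: the most widely shared unmasked k-mer."""
    counts = Counter()
    for s in seqs:
        kmers = {s[i:i + k] for i in range(len(s) - k + 1)}
        counts.update(m for m in kmers if "X" not in m)
    if not counts:
        return None
    kmer, n = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return kmer if n / len(seqs) >= support else None


def toy_score(pattern, s):
    """Stand-in for PSSM scoring: 1.0 if the literal pattern occurs, else 0.0."""
    return 1.0 if pattern in s else 0.0


tree = build_tree(["AAAGGG", "AAATTT", "CCCGGG", "CCCTTT"], toy_discover, toy_score,
                  support=0.5, min_support=0.25, min_cl_size=2, hi=1.0, lo=0.0)
print(tree.pattern, tree.match.pattern, tree.mismatch.pattern)  # AAA GGG CCC
```

Passing the discovery and scoring steps in as functions keeps the tree-building logic separate from whatever pattern discovery and PSSM matching are actually used. The toy scorer never produces an ambiguous score, so the third outcome (a sequence added to both sets) only arises with a scorer that returns values strictly between `lo` and `hi`.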