Remote Homology Detection

Single model

splashsearch -model model 
             -train training_set 
             -seq   sequence_db
             [options]

where:
model The multiple-PSSM model produced by running splash with the -f metaSPLASH option.
training_set The FASTA file used to generate the model.
sequence_db The file containing the FASTA sequences to search against the model.

The optional parameters are:
-html Generates an annotated HTML file in which individual PSSM matches are shown in color.
Default Not set
Comments Recommended only for small files, not exceeding a few hundred matches.
 
-pvalue p0 Only matches with a seq_pvalue <= p0 are reported.
Default 1E-5
Comments This is the standard threshold used. It is incompatible with the -evalue option: matches are filtered with one or the other, never both.
 
-evalue p1 Only matches with a db_pvalue <= p1 are reported.
Default Not Used
Comments This option is an alternative to the -pvalue threshold and is incompatible with it: matches are filtered with one or the other, never both.
 
-coeff c The minimum score a local PSSM match must reach to be used in the multi-PSSM score.
Default 3.2
Comments Changing this value is not recommended. Lower values increase the probability of identifying remote homologies, but they also increase the probability of false positives. Conversely, a higher value reduces false positives but also reduces the probability of detecting remote homologies.
 
-o file_name Writes the results to the file file_name rather than to standard output.
Default Not set
Comments  
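
The single-model syntax above can be assembled programmatically. The following is a minimal sketch, not part of splashsearch itself: the helper name build_single_model_command and the file names in the usage example are hypothetical, and only the flags documented above are used. Thresholds are passed through as strings so they reach the program exactly as typed (for example "1E-5").

```python
from typing import List, Optional


def build_single_model_command(
    model: str,
    training_set: str,
    sequence_db: str,
    pvalue: Optional[str] = None,
    evalue: Optional[str] = None,
    coeff: Optional[str] = None,
    html: bool = False,
    output_file: Optional[str] = None,
) -> List[str]:
    """Return the argument list for a single-model splashsearch run.

    Hypothetical helper: it only mirrors the flags listed in the manual.
    """
    # The manual states that -pvalue and -evalue are incompatible.
    if pvalue is not None and evalue is not None:
        raise ValueError("-pvalue and -evalue are incompatible; pass only one")

    argv = ["splashsearch",
            "-model", model,
            "-train", training_set,
            "-seq", sequence_db]
    if html:
        argv.append("-html")
    if pvalue is not None:
        argv += ["-pvalue", pvalue]
    if evalue is not None:
        argv += ["-evalue", evalue]
    if coeff is not None:
        argv += ["-coeff", coeff]
    if output_file is not None:
        argv += ["-o", output_file]
    return argv


if __name__ == "__main__":
    # Placeholder file names, chosen only for illustration.
    cmd = build_single_model_command(
        "family.model", "family.fasta", "database.fasta",
        evalue="0.01", output_file="hits.txt",
    )
    print(" ".join(cmd))
```

The returned list can be handed to subprocess.run as-is; building a list rather than a single string avoids shell quoting problems with file names.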

 

Multi model

splashsearch -multimodel model_file
             -seq        sequence_db
             [options]

where:
model_file The master file containing pointers to the individual, compiled multiple-PSSM models. These are obtained by first running splash with the -f metaSPLASH option and then splashsearch with the -build option.
sequence_db The file containing the FASTA sequences to search against the models.

The optional parameters are:
-html Generates an annotated HTML file in which individual PSSM matches are shown in color.
Default Not set
Comments Recommended only for small files, not exceeding a few hundred matches.
 
-pvalue p0 Only matches with a seq_pvalue <= p0 are reported.
Default 1E-5
Comments This is the standard threshold used. It is incompatible with the -evalue option: matches are filtered with one or the other, never both.
 
-evalue p1 Only matches with a db_pvalue <= p1 are reported.
Default Not Used
Comments This option is an alternative to the -pvalue threshold and is incompatible with it: matches are filtered with one or the other, never both.
 
-coeff c The minimum score a local PSSM match must reach to be used in the multi-PSSM score.
Default 3.2
Comments Changing this value is not recommended. Lower values increase the probability of identifying remote homologies, but they also increase the probability of false positives. Conversely, a higher value reduces false positives but also reduces the probability of detecting remote homologies.
 
-o file_name Writes the results to the file file_name rather than to standard output.
Default Not set
Comments  
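
The trade-off described for -coeff can be illustrated with a small sketch. It shows only the filtering step stated above (a local PSSM match is used only if its score reaches the minimum c); how the retained scores are combined into the multi-PSSM score is not described here and is deliberately left out. The function name and the scores are made up for illustration.

```python
from typing import List

DEFAULT_COEFF = 3.2  # default value of -coeff given in the option list


def local_matches_used(local_scores: List[float],
                       coeff: float = DEFAULT_COEFF) -> List[float]:
    """Keep only the local PSSM match scores that reach the minimum coeff.

    Hypothetical illustration of the -coeff filter, not splashsearch code.
    """
    return [score for score in local_scores if score >= coeff]


if __name__ == "__main__":
    scores = [1.5, 3.0, 3.2, 4.7, 6.1]  # made-up local match scores
    # Default threshold: the weaker matches are ignored.
    print(local_matches_used(scores))
    # Lower threshold: more matches contribute, which raises sensitivity
    # but also lets weaker (possibly spurious) matches through.
    print(local_matches_used(scores, coeff=2.5))
```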

Train

splashsearch -model model 
             -train training_set 
             -seq   sequence_db
             [options]

where:
model The multiple-PSSM model produced by running splash with the -f metaSPLASH option.
training_set The FASTA file used to generate the model.
sequence_db This is a file containing a few hundred sequences representing the "average" database against which queries will be run. For instance, this could be a random subset of the SWISSPROT database. 

There are no optional parameters 
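
A sequence_db of a few hundred sequences, as described above, can be drawn at random from a larger FASTA file. The sketch below is one way to do it using only the Python standard library; the function names are hypothetical and nothing here is part of splash or splashsearch.

```python
import random
from typing import Iterable, List, Optional, Tuple

FastaRecord = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> List[FastaRecord]:
    """Parse FASTA text, given as an iterable of lines, into records."""
    records: List[FastaRecord] = []
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def sample_background(records: List[FastaRecord], n: int,
                      seed: Optional[int] = None) -> List[FastaRecord]:
    """Return n records chosen at random without replacement.

    If fewer than n records are available, all of them are returned.
    """
    if n >= len(records):
        return list(records)
    return random.Random(seed).sample(records, n)


def write_fasta(records: List[FastaRecord], width: int = 60) -> str:
    """Format records as FASTA text with sequence lines of fixed width."""
    out: List[str] = []
    for header, seq in records:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in out)


if __name__ == "__main__":
    text = ">a\nMKV\nLLA\n>b\nGGG\n>c\nTTT\n"  # toy input
    subset = sample_background(read_fasta(text.splitlines()), 2, seed=0)
    print(write_fasta(subset), end="")
```

In practice the input lines would come from an open file handle and n would be a few hundred; passing a seed makes the subset reproducible.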
