stackPACK™ Conventions
Naming conventions
ln#
Clonelink accession number.
During the stack_Link step of the clustering pipeline, the program assigns internal clonelink numbers to each of the clonelinks produced, starting at ln1 (link 1).
cl#
stackPACK cluster accession number.
Cluster groupings created in the stack_Cluster step (d2_cluster algorithm).
During the clustering pipeline, the program assigns internal cluster numbers to each of the clusters produced, starting at cl1 (cluster 1).
Cluster accession numbers are assigned sequentially per project.
ct#
stackPACK contig accession number.
A cluster may be split into one or more contigs by the stack_Assemble step (PHRAP algorithm).
The program assigns internal contig numbers to the contigs produced, starting at ct1 (contig 1).
Contig numbers and cluster numbers are stored independently in the database. Some clusters will have more than one contig whereas others will have only one.
Thus, the contig number does not necessarily correspond with the cluster number.
cn#
stackPACK final consensus accession number.
A contig may have more than one consensus sequence to represent alternate expression forms.
Final consensus sequences are generated by the stack_Analysis step.
The program assigns internal consensus numbers to each of the contig consensus sequences, starting at cn1 (consensus 1).
Consensus numbers are assigned sequentially per project.
CRAWID
The CRAW accession number given to consensus sequences in the Alignment Analysis View.
These are assigned when the CRAW algorithm is run as part of the stack_Analysis step.
Each CRAWID corresponds to a subalignment number in the Alignment Analysis View.
CRAW accession numbers are assigned sequentially per contig.
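The prefixed accession numbers above (ln#, cl#, ct#, cn#) share one shape: a two-letter prefix followed by a number that starts at 1. As a minimal sketch, assuming accessions appear as plain strings such as "ct12", a script that post-processes stackPACK output might split them as follows. The function name parse_accession and the PREFIXES table are hypothetical and are not part of stackPACK.

```python
import re

# Prefixes as listed in the naming conventions; the labels are illustrative.
PREFIXES = {
    "ln": "clonelink",
    "cl": "cluster",
    "ct": "contig",
    "cn": "consensus",
}

# Two-letter prefix followed by a positive integer (numbering starts at 1).
_ACCESSION = re.compile(r"(ln|cl|ct|cn)([1-9][0-9]*)")


def parse_accession(accession):
    """Split an accession such as 'ct12' into ('contig', 12).

    Hypothetical helper: raises ValueError for anything that does not
    follow the prefix-plus-number convention.
    """
    match = _ACCESSION.fullmatch(accession)
    if match is None:
        raise ValueError("not a recognised accession: %r" % accession)
    prefix, number = match.groups()
    return PREFIXES[prefix], int(number)
```

For example, parse_accession("cl1") yields ("cluster", 1). Because contig and cluster numbers are stored independently, the number in "ct12" says nothing about which cluster the contig belongs to.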
Other conventions
-XXXXXXXXXXXXXXXXXXXX-
Clonelinked cluster linker region.
The clonelinked consensus sequence is created by concatenating the primary consensus sequence from each of the constituent clusters, separating each with a linker region of 20 consecutive Xs.
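The concatenation described above can be sketched in a few lines. This is an illustration of the convention only, assuming the primary consensus sequences are available as plain strings in cluster order; the names LINKER and join_clonelinked_consensus are hypothetical and do not come from stackPACK.

```python
# Linker region: 20 consecutive upper case Xs, as described above.
LINKER = "X" * 20


def join_clonelinked_consensus(primary_consensus_sequences):
    """Concatenate primary consensus sequences with the linker between them.

    Hypothetical helper: the linker appears only between sequences,
    never at the start or end of the result.
    """
    return LINKER.join(primary_consensus_sequences)


def split_clonelinked_consensus(clonelinked_consensus):
    """Recover the per-cluster sequences by splitting on the linker."""
    return clonelinked_consensus.split(LINKER)
```

Splitting on the linker works here because masked data uses lower case x, so a run of 20 upper case Xs is assumed not to occur inside an individual consensus sequence.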
-xxxxxx-
Masked sequence data.
Sequences are masked by replacing the contaminated portions of the sequence with lower case x's.
These masked regions are retained during the clustering pipeline and are visible in the Sequence View, PHRAP Alignment View and CRAW Alignment View.
If all the constituent EST or mRNA sequences in a particular region of the CRAW alignment are masked, the resultant consensus is written as lower case 'n' in that region.
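To make the "all constituent sequences are masked" condition concrete, the sketch below finds the alignment columns in which every sequence carries a lower case x. It does not reproduce the CRAW consensus calculation; it only locates the positions where, by the rule above, the consensus would be 'n'. It assumes the aligned sequences are equal-length strings, and the function name fully_masked_columns is hypothetical.

```python
def fully_masked_columns(aligned_sequences):
    """Return 0-based column indices where every sequence is masked ('x').

    Hypothetical helper: expects equal-length aligned strings and
    raises ValueError otherwise.
    """
    if not aligned_sequences:
        return []
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("aligned sequences must all have the same length")
    return [
        column
        for column in range(length)
        if all(seq[column] == "x" for seq in aligned_sequences)
    ]
```

A column in which at least one sequence has unmasked data is not reported, since the consensus there can still be called from that sequence.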
-nnnnnnnnnn-
Truncated poly-n regions in consensus sequences.
Consensus sequences may have long strings of 'n' due to underlying sequence data defined by 'n' or because all constituent sequences in the region contained masked data.
Runs of 'n' longer than 10 bases are truncated to 10 'n's when the final consensus sequence is calculated.
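The truncation rule can be illustrated with a short regular-expression sketch. It assumes the consensus is a plain string and that only lower case 'n' is affected; the function name truncate_poly_n and the max_run parameter are hypothetical and are not stackPACK options.

```python
import re


def truncate_poly_n(consensus, max_run=10):
    """Shorten every run of more than max_run 'n' characters to max_run.

    Hypothetical helper: runs of max_run or fewer 'n' are left unchanged.
    """
    long_run = re.compile("n{%d,}" % (max_run + 1))
    return long_run.sub("n" * max_run, consensus)
```

For instance, a consensus containing 15 consecutive 'n' characters comes back with exactly 10 in that position, while a run of 10 or fewer is returned as it was.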
Mouse over
Placing the mouse over the icons in the left panel will provide icon descriptions.