Q: What is YFull's STR Variants Methodology?

A: In the STR variants report, YFull predicts how your STR values relate to your SNP mutations. The report is available from the menu on your YFull account home page. When you select this menu item, the next page lets you choose a sample (if your account holds more than one) and the display option for the results (All, 111, 67, 37, 25, or 12). Select the Submit button to generate your report. You may also download your STR variants as a .CSV file.

YFull's ability to predict a relationship between a specific SNP (such as a SNP used as a subclade name or a SNP labelled as a "private mutation") and a specific derived value of an STR depends on several factors: the size of the YFull database of YFxxxxxx samples; the number of samples having the specified derived STR value; the coverage and quality of the samples; the fact that back mutations cannot be identified; and the fact that some STRs are of low quality (such as the palindromic STRs DYS385 and DYS464) and cannot be analyzed.

The variants report for a sample may include multiple derived values for the same STR, with each value related to a different subclade (SNP). The source of the multiple values is the YFull reconstruction of the mutation history for the STRs shown in the report. The reconstruction starts with the single instance of each STR found in the sample's raw data.

Each STR in the report has an ancestral (ANC) value and a derived (DER) value.

To determine the ancestral value for each STR, YFull uses the maximum parsimony algorithm for a given phylogeny, as described in W. M. Fitch (1971), "Toward defining the course of evolution: minimum change for a specified tree topology," Systematic Zoology 20 (4), pp. 406-416. [Note: YFull believes that this maximum parsimony algorithm may hide many STR mutations (forward and backward) in the ancient haplogroups (those older than 20,000 years before present).] [Note also that the approach chosen by YFull differs from that of some testing companies, which use a modal value method.]
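
As a rough illustration of the Fitch (1971) parsimony idea referenced above, here is a minimal Python sketch for a single STR on a small binary tree. It is not YFull's implementation: the tree, the node names (root, A, B, s1 to s4), the repeat counts, and the function names fitch_sets and assign_states are all made up for the example. A real haplogroup tree has nodes with more than two branches, which this sketch does not handle.

```python
from typing import Dict, Set, Tuple

# Hypothetical binary tree: internal node -> (left child, right child).
Tree = Dict[str, Tuple[str, str]]


def fitch_sets(tree: Tree, leaf_values: Dict[str, int], root: str):
    """Bottom-up pass: candidate value sets per node and the minimum number of changes."""
    sets: Dict[str, Set[int]] = {}
    changes = 0

    def visit(node: str) -> Set[int]:
        nonlocal changes
        if node not in tree:                 # leaf: the value observed in the sample
            sets[node] = {leaf_values[node]}
            return sets[node]
        left, right = tree[node]
        a, b = visit(left), visit(right)
        common = a & b
        if common:                           # children agree: keep the shared values
            sets[node] = common
        else:                                # children disagree: one change is needed
            sets[node] = a | b
            changes += 1
        return sets[node]

    visit(root)
    return sets, changes


def assign_states(tree: Tree, sets: Dict[str, Set[int]], root: str) -> Dict[str, int]:
    """Top-down pass: pick one most-parsimonious value per node."""
    states = {root: min(sets[root])}         # ties broken arbitrarily (smallest value)
    stack = [root]
    while stack:
        node = stack.pop()
        for child in tree.get(node, ()):
            parent_state = states[node]
            # Keep the parent's value when the child's candidate set allows it.
            states[child] = parent_state if parent_state in sets[child] else min(sets[child])
            stack.append(child)
    return states


# Made-up example: four samples, one STR, repeat counts 13, 13, 13, 14.
tree = {"root": ("A", "B"), "A": ("s1", "s2"), "B": ("s3", "s4")}
leaf_values = {"s1": 13, "s2": 13, "s3": 13, "s4": 14}

sets, changes = fitch_sets(tree, leaf_values, "root")
states = assign_states(tree, sets, "root")
print(changes, states["root"], states["s4"])
```

In this toy case the reconstructed ancestral value at every internal node is 13, and the only change falls on the branch leading to s4. A modal value method would also give 13 here, but the two approaches can disagree when a value is common only because one subclade is heavily sampled.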

Derived simply means any value that is different from the ancestral value.
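
To show how a report could list more than one derived value for the same STR, the following Python sketch walks one STR down a lineage and records each point where the reconstructed value changes. It is only an illustration under stated assumptions: the subclade names, the repeat counts, and the function variants_along_lineage are hypothetical, and treating ANC as the value reconstructed at the immediately preceding level is an assumption, not something the report documents.

```python
from typing import List, Tuple


def variants_along_lineage(lineage: List[Tuple[str, int]]) -> List[Tuple[str, int, int]]:
    """Return (subclade, ANC, DER) for each step where the value changes.

    lineage: (subclade name, reconstructed repeat count), ordered oldest to youngest.
    """
    variants = []
    for (_, parent_value), (subclade, value) in zip(lineage, lineage[1:]):
        if value != parent_value:            # derived = differs from the ancestral value
            variants.append((subclade, parent_value, value))
    return variants


# Hypothetical lineage for one STR; the names and values are invented.
lineage = [("clade-A", 13), ("clade-B", 14), ("clade-C", 14), ("sample", 15)]
variants = variants_along_lineage(lineage)
print(variants)
```

Here the same STR yields two entries, one tied to clade-B (13 to 14) and one tied to the sample itself (14 to 15), each associated with a different level of the tree.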

The variants report indicates which STRs are shared with other samples (identified by their YFxxxxx IDs), and it uses a blue "Detected" bar to show the percentage of samples at a specific subclade level in which the STR has been found. Hover your pointer over the blue bar to see the exact percentage. The .CSV report gives only the percentage, not the blue bar. More information about an STR may be obtained by selecting the magnifying glass icon for that STR.
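
One plausible reading of the "Detected" percentage can be sketched in Python as follows. This is an assumption for illustration only: YFull does not state here exactly how the figure is computed, so the choice of denominator (samples in the subclade with a usable reading for the STR), the sample names, and the function detected_percentage are all hypothetical.

```python
from typing import Dict, Optional


def detected_percentage(subclade_values: Dict[str, Optional[int]], derived_value: int) -> float:
    """Share of samples with a reading that carry the derived value, in percent.

    subclade_values: sample id -> repeat count, or None when the STR could not be read.
    """
    readings = [v for v in subclade_values.values() if v is not None]
    if not readings:                         # no usable readings at this level
        return 0.0
    hits = sum(1 for v in readings if v == derived_value)
    return 100.0 * hits / len(readings)


# Invented data: four samples in one subclade, one of them without a reading.
subclade_values = {"s1": 14, "s2": 14, "s3": 13, "s4": None}
pct = detected_percentage(subclade_values, derived_value=14)
print(round(pct, 1))
```

With these invented values, two of the three samples that have a reading carry the derived value, so the sketch reports roughly two thirds.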

For each STR in the variants report, the mutation rate is shown on a scale of two to five stars: the more stars, the less often the STR mutates. Two stars mean frequent mutation and five stars mean that mutation does not occur often. This STR star scale is not the same as the star scale used for SNP quality. In the .CSV report, the mutation rate appears only as a number (2, 3, 4, or 5).

At present YFull does not offer a research tool allowing the comparison of the STR variants of two or more samples. However, comparative STR variants may be studied by using the Y-Results>View Y-STRs feature in Groups, but this feature only includes the samples of customers who have voluntarily joined one or more Groups.

It is possible that the "signature" for a sample provided by the predicted linkage of STRs to SNP subclades may lead to new testing approaches for screening potential relatives or cousins. At this point, however, there are no known "success stories."

Last updated March 28, 2018.