Sunday, April 21, 2013

Source code for the ASD based TMRCA calculator (Octave)

The code for the TMRCA calculator of YDNA STR haplotypes that I use can be downloaded from here :

See also here for instances of where I have used the calculator in the past:

The code is written for Octave and is also Matlab compatible. There is also an instruction file that explains how to run the calculator in the folder that is linked above which can also be found below:

To check if the TMRCA program is correctly working on your system, first run it with the dataset
provided here before trying different datasets, to do so:

(1) Make sure you have Octave loaded on your system (either Windows or Linux will work) and start octave in the command line.
(2) In the command line, change your working directory to the directory where you saved the unzipped  folder by using: cd ~PATH/TMRCA_ASD/
If you are unsure of your current working directory, type the command: pwd()
(3) Type: fcompositeTMRCA("Buckova_EM78","all")
(4) If this produces results, then the program and functions are correctly installed and you can proceed to reading and analysing different datasets.

Reading and analysing new Data

After correctly executing the above steps, read and analyse new data by using the following steps:
(1)open the example STR data file in the "TMRCA_ASD/Loaded_Data/" folder entitled "EM35_STR.xls"
(2)Any STR data file to be analysed should first be made in the same format as the "EM35_STR.xls" file , specifically:
(a) DYS names in the first row should have the exact same nomenclature (the orders can be different however).
(b) Each row (except the first) should represent one sample.
(c) Each coloumn (except the first) should represent repeats for one marker/DYS#.
(d) The first column should represent sample identifiers, ex. Kit#, sample ID,...
(e) The cell found in the first row and first column should have the Dataset's name, this will be the same   name used throughout the analysis.
(f) No cell shall contain null values and avoid having cells that contain characters which have spaces in between them.
(3) In Excel or openoffice, convert the "EM35_STR.xls" workbook to a ".csv" file by saving the file as "YSTR.csv" and placed into the
same "TMRCA_ASD/Loaded_Data/" folder. The program will only look for a file entitled "YSTR.csv", so make sure that the same name is used for your file.
(4) Start octave, in the command line, change the working directory to "~PATH/TMRCA_ASD/Loaded_Data/"
(5) Type on the octave prompt: readdata
(6) Octave will start reading the dataset and create the file "EM35-Balanced" in the folder "/TMRCA_ASD/Loaded_Data/" when it is finished.
(7) If you want to analyse a specific set of markers from your dataset go to setep 8, otherwise go to step 9
(8) Go to the file "/TMRCA_ASD/Markerlist/49markerlist.txt", and pick the markers you want to use for the analysis. Then save your chosen
markers into a new txt file in the same folder as "/TMRCA_ASD/Markerlist/". Take a look at the file "8_Chiaronimarkerlist.txt" for
an example of how the marker list should look.
(9) In octave, change your working directory back up one level by typing: cd ..
(10) If you are specifying a set of markers to use in the analysis, then run the program by typing: fcompositeTMRCA("EM35-Balanced","8_Chiaronimarkerlist.txt"), otherwise, just type: fcompositeTMRCA("EM35-Balanced","all").
Update : Version2 -  *.CSV read, + Auto path detect. (fcompositeTMRCA.m, fmarkerextract.m, readdata.m)
Update(04/25/13) : Version3 - Add option for using all available markers, print used/unused markers. (fcompositeTMRCA.m, fmarkerextract.m, fAssignmutation.m)

No comments:

Post a Comment