"To do" list for R/qtl ---------------------------------------------------------------------- This file is intended to contain a list of many of the additions and revisions that are planned for the R/qtl package. If you any additions or revisions to suggest, please send an email to Karl Broman, . ---------------------------------------------------------------------- SHORT TERM: o markerlrt() adds attribute "onlylod = TRUE"; this gets stripped off by various functions and should be re-established o In plot.rfmatrix, include marker name in title (unless main is provided as an argument). o Bugs in summary.scanone in the case of multiple phenotypes and multiple thresholds for formats other than allpheno and allpeaks. I get warnings like In lod[, i] > threshold : longer object length is not a multiple of shorter object length Also, I need to use as.numeric() on the thresholds o Helper function for create.map to identify locations on the grid o function to pull out genoprobs or imputations o Use snow() with orderMarkers o Bug in replacemap.cross in case of sex-specific map and genoprob/draws maps that need interpolation o beeswarm package for plot.pxg o In sim.cross, to get a QTL on the X chromosome, you need to use chr=20 rather than chr="X". o scanPhyloQTL should give warning if there are different sets of markers. o Function to calculate recombination fraction and LOD score for just adjacent markers. o geno.crosstab seems unnecessarily slow o Function to get no. qtl in a QTL object; also no. interactions (via the formula) o Look at possible global options to see if any might interfere with things. o Saunak reported a problem in summary.scanone: Some time in the last month, the summary.scanone function does not work (for one data set) when I ask it for p-values with pvalues=TRUE. It works fine when I don't ask for the p-values. ------------------------------------------ Error in calcPermPval(peaks, perms) : NA/NaN/Inf in foreign function call (arg 4) ------------------------------------------- o checkcovar gives messed up column numbers re "Following phenotypes are not numeric" o Column names in output of addpair summary (particularly regarding the 6x15 and 7x15 scans in the hyper demo I did). o in stepwiseqtl and refineqtl: include an argument that is the distance between qtl (or number of markers between qtl) and somehow prevent qtl from getting too close [to avoid artifacts] o Bug in plot.scantwo: contours=TRUE gives warning: In any(contours) : coercing argument of type 'double' to logical Also: maybe change the color of the contour. o Add cross type "haploid" with genotype names like "A" and "B" rather than "AA" and "AB", but otherwise treat like a backcross (along the lines of "dh"). o max.scantwo could take a chr argument o Add cross type as argument to read.cross; attempt to convert data to that type. o In sim.cross, do the qtl get sorted by genomic position? And then do the qtlgeno need to be reordered back? o Effect plot on X chromosome: be sure that males and females get split properly. o replacemap when you have calc.genoprob results but from step=0; seems to give an error o Revise calc.errorlod to do things in parallel (by chromosome or individual? ... I'd guess in batches of individuals) o Use format.pval in printing pvalues, so they're never strictly 0. o In c.cross, include an argument for "flipping" a backcross, so that a backcross to (AxB)xB can be combined with an intercross o implement the "controlAcrossCol" argument within summary.scanone, to calculate p-values? o qtl x covar interactions in stepwiseqtl o scantwo: ability to scan just specific chromosome pairs o special treatment of X chromosome in scantwo permutations and in stepwiseqtl ---------------------------------------------------------------------- MEDIUM-SHORT TERM: o cim(): Haley-Knott-based approach to deal with missing genotypes at marker covariates. o cim(): rather than forward selection to a fixed number of markers, do forward selection until either a marker is not significant or the fixed maximum. o get scanone and scantwo to work with a single marker on a chromosome. o function for getting x axis location in scantwo picture (like the new xaxisloc.scanone) o In mqmsetcofactors(), we might allow cofactors() to be marker names rather than just marker indices. o In the various MQM functions, pheno.col might be a phenotype name rather than just numeric index. Also, it might be a vector of phenotype values. (Both, as in scanone etc.) o Better way to make use of ... in plot functions that allows unmentioned defaults. o method for having alternative genetic maps and for storing a physical map; linear interpolation for plotting scanone output in Mbp. o make various functions work appropriately with class "special" (eg, plot and summary) o Fix convergence problems in scanone with method="ehk", especially in the case of covariates and interactions. o Plotting scanone results for *many* phenotypes as an image plot. (Perhaps threshold the really high LODs and eliminate curves that fail to meet a given threshold.) o Allow plot.geno (or other cases in which subset.cross is used to pull out individuals) to refer to individual IDs. o Go through all of the various plot functions and make sure that the x- and y-axis labels are created with axis() rather than text() and segments(). This way, the size of the labels can be modified with par(cex.axis) o Finish off the work to get coefficient estimates by imputation in fitqtl for the X chromosome in BC and F2. o Revise c.cross so that you can combine crosses even if there are different numbers of chromosomes o 2 traits vs genotype at one QTL (with regression lines) o Documentation on RI lines. ---------------------------------------------------------------------- MEDIUM TERM: o eQTL-related stuff: - Compressing calc.genoprob results (simpler version of locations, or just the grid) - Mb positions within cross object; multiple genetic maps - Positions in phenotype object + other annotations o For the stepwiseqtl function: - should be able to save all models with plod within some value of the best model - QTL x covariate interactions in countterms - it'd be nice to be able to have all models that are one term away from the optimal model (maybe this could be a separate function) - it'd be nice to be able to easily make a plot of the final model plus information about all models that are one term away o Revise plot.rf so that it can have different color schemes, as in plot.scantwo. Allow a zscale. o Simpler methods to get at interesting bits in the est.rf results. o cM coordinates in scantwo plot for multiple chromosomes: Fix plot.scantwo for the case that incl.markers=TRUE, so that positions are not equally spaced, but are according to the genetic map. (This is working if just one chromosome is plotted, but should be made to work generally.) o scanone (or is it scantwo?) with method="hk": Major memory problems in the permutations with multiple phenotypes. o X chromosome for 4- and 8-way RIL by sibling mating (also 2-way RIL). o pull all of the genoprob and imputation information out of the qtl objects and replace it with indices to the information in the cross object. (This stuff takes up too much space!) o lodint/bayesint as option to summary.scanone. o conditional LOD scores from scantwo output. o scanqtl: if formula symmetric w/ respect to two QTL that are on the same chromosome, only scan the triangle (rather than the square) o write tools for converting the output from scanqtl() to the format for scanone() or scantwo(), according to whether it's a 1-d or 2-d search, or print a warning otherwise. o pairprob at markers + putative QTL? For linked loci in scanqtl by HK/EM. o Include Bjarke's code on eHK for scantwo. o revise read.cross with format="mm" to deal with ril by selfing and/or sib mating o read.cross for "qtx" sometimes doesn't seem to take the genotype pattern appropriately; read in a backcross as if it were an F2. o An NA in the mapmaker data file caused an error in read.cross; the line became too long. Maybe this is true whenever an item doesn't match what is expected. o Speed up read.cross.mm; deliver meaningful errors if map/genotypes don't match, and if too many genotypes in a row. o scanone with additive alleles at QTL o Pull out results for an interval from scanone. o Add additional HMM functions for the X chr in RIL, as the marginal genotype distribution is 2:1 rather than 1:1 and the transition matrix is not symmetric Pr(BB|AA) = 2r/(1+4r) and Pr(AA|BB) = 4r/(1+4r) o Add appropriate functions to analyze advanced intercrosses (AILs) and advanced backcross (BCn). o Allow phenotypes on multiple individuals (esp for recombinant inbred lines). ---------------------------------------------------------------------- LONG TERM: o covariates with 2part model in scanone o X chromosome in cim(). o Analysis of censored phenotypes. o Analysis of residuals. o Imprinting/parent-of-origin effects. o Treating a covariate as a random effect. o Multiple phenotypes (esp. regarding pleiotropy). o Take the fit of the null model outside of the C code for the imputation method in scanone and scantwo, so that it only has to be done once (rather than for each chr or chr pair). o Generalized linear models in scanone and scantwo. o Incorporate code from Brian Yandell, Fei Zou and Amy Jin on semi-parametric QTL mapping methods. o Analysis functions such as scanone and scantwo might assign an attribute to their output which identifies the input data and/or function call. o Re-write the C code for EM underneath scanone and scantwo so that it is not so tedious. ---------------------------------------------------------------------- end of TODO.txt