Structure Pipeline

Users of Structure (Pritchard et al. 2000) may be familiar with the interface of the BioHPC cluster at Cornell. Unfortunately, guest access was discontinued in May 2011. If you have Structure installed on your SGE supercomputing cluster, several features of the web-based BioHPC interface can be replaced with a pipeline of qsub and Python scripts. This pipeline will guide you through setting up your data file and parameter settings, running Structure efficiently at many values of K, summarizing those results using CLUMPP, and visualizing the results using custom R scripts:

  • popgen_parse.py: Parses a tab-delimited GenAlEx-like format into structure (and other) formats.
  • structure.csh: For each value of K, runs structure multiple times using an SGE array job.
  • simsum.py: Creates the simulation summary, useful for post-hoc tests such as deltaK.
  • structure2clumpp.py: Extracts the Q matrix from each run at a certain value of K, for use in CLUMPP.
  • structureplot.R: R script for making the familiar Structure bar plot.
  • deltak.R: Calculates the best K using the post-hoc deltaK method (Evanno et al. 2005).
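
For readers who want to see what the deltaK step computes, here is a minimal Python sketch of the idea. It is not the code in deltak.R. It assumes you already have the Ln P(D) value from each replicate run grouped by K (the function names `delta_k` and `best_k` and the input layout are made up for illustration), and it uses the common shortcut of taking the second difference of the mean Ln P(D) across replicates, divided by the standard deviation at that K.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: deltaK} for every K that has both neighbors K-1 and K+1.

    lnp_by_k is a hypothetical input layout: a dict mapping each K to a
    list of Ln P(D) values, one per replicate run (at least two per K).
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    scores = {}
    for k in sorted(lnp_by_k):
        # deltaK is undefined at the smallest and largest K tested.
        if k - 1 not in means or k + 1 not in means:
            continue
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            # Identical replicates would give a division by zero; skip them.
            continue
        # Absolute second difference of the mean log-likelihood curve.
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        scores[k] = second_diff / sd
    return scores


def best_k(lnp_by_k):
    """Return the K with the largest deltaK."""
    scores = delta_k(lnp_by_k)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Made-up numbers, only to show the call pattern.
    example_runs = {
        1: [-5000.0, -5002.0, -4998.0],
        2: [-4500.0, -4501.0, -4499.0],
        3: [-4400.0, -4420.0, -4380.0],
        4: [-4390.0, -4450.0, -4330.0],
    }
    print(delta_k(example_runs))
    print("best K:", best_k(example_runs))
```

Because deltaK needs both neighbors, it can never pick the smallest or largest K you ran, so it is worth running Structure at one K below and one above the range you actually care about.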

To download the scripts, please visit my GitHub repository!
