Dataset construction can involve any number of steps, depending on the kind of dataset being built. It may be as simple as concatenating multiple sequence alignments into a multigene dataset, or it may involve more extensive analyses such as orthology and paralogy assignment.
Here we will look at concatenation, orthology and clustering, and some simple paralogy analyses.
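As a rough illustration of what concatenation means in practice, the sketch below joins per-gene alignments into a single supermatrix and records where each gene starts and ends. It is a minimal example under stated assumptions, not the method of any particular tool: each alignment is assumed to be a plain dictionary mapping a taxon name to its aligned sequence, taxa missing from a gene are padded with gap characters, and the function name `concatenate_alignments` and the example taxa are hypothetical.

```python
def concatenate_alignments(alignments):
    """Concatenate alignments given as dicts of taxon name -> aligned sequence.

    Returns (supermatrix, partitions), where supermatrix maps each taxon to its
    concatenated sequence and partitions holds 1-based, inclusive (start, end)
    column ranges, one per input alignment.
    """
    # Every taxon seen in any gene gets a row in the supermatrix.
    taxa = sorted({name for aln in alignments for name in aln})
    pieces = {name: [] for name in taxa}
    partitions = []
    start = 1

    for aln in alignments:
        # All sequences within one alignment must share the same length.
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within an alignment must have equal length")
        length = lengths.pop()

        for name in taxa:
            # Assumption: taxa absent from this gene are filled with gaps.
            pieces[name].append(aln.get(name, "-" * length))

        partitions.append((start, start + length - 1))
        start += length

    supermatrix = {name: "".join(parts) for name, parts in pieces.items()}
    return supermatrix, partitions


# Hypothetical toy data: taxonB is missing from the second gene.
gene1 = {"taxonA": "ACGT", "taxonB": "ACGA", "taxonC": "AC-T"}
gene2 = {"taxonA": "GGC", "taxonC": "GGT"}

matrix, parts = concatenate_alignments([gene1, gene2])
for name, seq in matrix.items():
    print(name, seq)
print(parts)
```

The partition ranges matter because downstream analyses often apply a separate model to each gene, so the column boundaries need to survive concatenation.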