MPtools Molecular Phenotypic Tools

MPtools (Molecular Phenotypic Tools) is a specialized tool for processing molecular phenotypic data. Its functions include data filtering, correction, and construction of a molecular kinship matrix. The software is written in C++.

MPtools Features:

1. Extract/Delete Specific Molecular Phenotypes or Samples

--extract <ID file> Extract the molecular phenotypes listed in the ID file

--exclude <ID file> Delete the molecular phenotypes listed in the ID file

--keep <ID file> Keep the samples listed in the ID file

--remove <ID file> Remove the samples listed in the ID file
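
The selection logic behind these four options can be sketched in a few lines of Python. This is only an illustration, not MPtools' implementation: the ID file format (one ID per line) and the helper names `read_ids` and `select` are assumptions.

```python
def read_ids(lines):
    """Parse an ID list: one ID per line, blank lines ignored (assumed format)."""
    return {line.strip() for line in lines if line.strip()}


def select(ids, include=None, drop=None):
    """Keep IDs found in `include` (if given), then discard those in `drop` (if given)."""
    return [i for i in ids
            if (include is None or i in include) and (drop is None or i not in drop)]


# Hypothetical phenotype and sample IDs.
genes = ["gene1", "gene2", "gene3"]
samples = ["s1", "s2", "s3"]

extracted = select(genes, include=read_ids(["gene1\n", "gene3\n", "\n"]))  # like --extract
remaining = select(samples, drop=read_ids(["s2\n"]))                       # like --remove
```

The same `select` helper covers --exclude (pass the gene list as `drop`) and --keep (pass the sample list as `include`).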

2. Non-zero Proportion Filtering

--pheno-rate <value> Exclude molecular phenotypes whose proportion of non-zero values across individuals is below the threshold

--sample-rate <value> Exclude individuals whose proportion of non-zero values across molecular phenotypes is below the threshold
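
A minimal Python sketch of this kind of filter is shown below. It assumes the data are held as a dictionary of phenotype rows (one value per sample) and that a row or column is kept when its non-zero rate is at least the threshold; the function names and the toy data are hypothetical, and MPtools' own boundary handling may differ.

```python
def nonzero_rate(values):
    """Fraction of entries that are strictly greater than zero."""
    return sum(1 for v in values if v > 0) / len(values)


def filter_phenotypes(matrix, threshold):
    """Keep phenotype rows whose non-zero rate is at least `threshold`."""
    return {pid: row for pid, row in matrix.items() if nonzero_rate(row) >= threshold}


def filter_samples(matrix, sample_ids, threshold):
    """Keep sample columns whose non-zero rate is at least `threshold`."""
    rows = list(matrix.values())
    keep = [j for j in range(len(sample_ids))
            if nonzero_rate([row[j] for row in rows]) >= threshold]
    kept_ids = [sample_ids[j] for j in keep]
    kept_matrix = {pid: [row[j] for j in keep] for pid, row in matrix.items()}
    return kept_matrix, kept_ids


# Hypothetical toy data: 3 phenotypes (rows) x 4 samples (columns).
toy = {
    "geneA": [5.0, 0.0, 3.2, 1.1],
    "geneB": [0.0, 0.0, 0.0, 2.0],
    "geneC": [0.0, 0.0, 0.0, 0.0],
}
samples = ["s1", "s2", "s3", "s4"]

kept_pheno = filter_phenotypes(toy, 0.5)                      # only geneA is non-zero in >= 50% of samples
kept_matrix, kept_samples = filter_samples(toy, samples, 0.5)  # only s4 has >= 50% non-zero phenotypes
```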

3. Filtering Molecular Phenotypes Based on Coefficient of Variation

--CV-value <value> Retain phenotypes whose coefficient of variation (CV) is above the threshold; phenotypes with a low CV (e.g., reference genes) are filtered out

--CV-top <value> Retain the top <value> percent of phenotypes ranked by CV
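
The two CV filters can be illustrated with the following Python sketch. It assumes the CV is the sample standard deviation divided by the mean and that the top percentage is rounded up to a whole number of phenotypes; both choices, and the function names, are assumptions rather than documented MPtools behavior.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean (0.0 if the mean is 0)."""
    mean = statistics.fmean(values)
    if mean == 0:
        return 0.0
    return statistics.stdev(values) / abs(mean)


def filter_by_cv_value(matrix, threshold):
    """Keep phenotypes whose CV is strictly greater than `threshold`."""
    return {pid: row for pid, row in matrix.items()
            if coefficient_of_variation(row) > threshold}


def filter_by_cv_top(matrix, percent):
    """Keep the top `percent` percent of phenotypes ranked by CV, highest first."""
    ranked = sorted(matrix, key=lambda pid: coefficient_of_variation(matrix[pid]),
                    reverse=True)
    n_keep = math.ceil(len(ranked) * percent / 100)
    keep = set(ranked[:n_keep])
    return {pid: row for pid, row in matrix.items() if pid in keep}


# Hypothetical expression values for three genes across four samples.
toy = {
    "stable": [10.0, 10.0, 10.0, 10.0],
    "mild": [8.0, 10.0, 12.0, 10.0],
    "variable": [1.0, 20.0, 2.0, 17.0],
}

above_half = filter_by_cv_value(toy, 0.5)
top_half = filter_by_cv_top(toy, 50)
```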

4. Proportion Type Correction for Molecular Phenotypes (Microbiome)

--count Convert count data to proportion data

--clr Centered log-ratio transformation for molecular phenotypes across samples

--min-frac <value> Replace zeros with <value> multiplied by each sample's smallest non-zero phenotype value

--GBM <value> Use geometric Bayesian filling for zeros
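
To make the proportion-type correction concrete, here is a small Python sketch of the sequence counts → proportions → zero replacement → centered log-ratio for a single sample. The zero replacement shown is a simple fraction-of-minimum rule used as a stand-in; it is not the geometric Bayesian method, and the function names and the default `frac=0.5` are assumptions. The transformation is computed within one sample over its phenotypes, which is the standard definition of the centered log-ratio.

```python
import math


def to_proportions(counts):
    """Divide each count by the sample total so the values sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def replace_zeros(props, frac=0.5):
    """Replace zeros with `frac` times the smallest non-zero value in the sample."""
    floor = frac * min(p for p in props if p > 0)
    return [p if p > 0 else floor for p in props]


def clr(props):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in props]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical OTU counts for one sample.
counts = [40, 0, 10, 50]
props = to_proportions(counts)
filled = replace_zeros(props, frac=0.5)  # zeros must be replaced before taking logs
transformed = clr(filled)                # values sum to zero by construction
```

Because the centered log-ratio is unchanged when all parts of a sample are multiplied by the same constant, the filled proportions do not need to be rescaled to sum to 1 before the transformation.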

5. Non-proportion Type Correction for Molecular Phenotypes (Transcriptome, Proteome, Metabolome)

--quantile-norm <value> Quantile normalization

--inverse-norm <value> Inverse normal transformation
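
Both transformations are rank based, which the following Python sketch illustrates. It assumes quantile normalization is applied to samples (each sample receives the same reference distribution) and that the inverse normal transformation uses Blom's offset of 3/8; ties are broken by position for simplicity. These details and the function names are assumptions, not a description of MPtools internals.

```python
from statistics import NormalDist


def ranks(values):
    """0-based ascending ranks; ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def quantile_normalize(columns):
    """Give every sample (column) the same distribution: the mean of the sorted columns."""
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    reference = [sum(col[r] for col in sorted_cols) / len(columns) for r in range(n)]
    return [[reference[r] for r in ranks(col)] for col in columns]


def inverse_normal(values, c=3.0 / 8.0):
    """Rank-based inverse normal transformation with offset `c` (Blom: 3/8)."""
    n = len(values)
    nd = NormalDist()
    return [nd.inv_cdf((r + 1 - c) / (n - 2 * c + 1)) for r in ranks(values)]


# Hypothetical data: two samples with three phenotypes each.
samples = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(samples)

# Hypothetical values of one phenotype across five individuals.
z = inverse_normal([0.3, 2.5, 1.1, 9.0, 4.2])
```

After quantile normalization both samples contain the same set of values, only in different order; after the inverse normal transformation the values keep their original ranking but follow an approximately standard normal distribution.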

6. Construct Molecular Phenotype Kinship Matrix

--kinship Construct a kinship matrix for molecular phenotypes

--kin-lambda <value> Exponent applied to the variance weight of each molecular phenotype (default: 1)

--kin-bin <value> Generate a binary kinship matrix
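
The formula MPtools uses is not given here, so the Python sketch below shows only one common way to build a sample-by-sample relationship matrix from molecular phenotypes: center each phenotype, scale it, and average the cross-products over phenotypes. The `power` parameter is a hypothetical stand-in for a variance weighting exponent (1.0 gives ordinary standardization) and should not be read as the definition of --kin-lambda.

```python
import statistics


def relationship_matrix(matrix, power=1.0):
    """Sample-by-sample similarity from phenotype rows (phenotypes x samples).

    Each phenotype is centered and divided by sd**power. `power` is a
    hypothetical parameter; power=1.0 gives ordinary standardization.
    """
    n_samples = len(matrix[0])
    scaled = []
    for row in matrix:
        mean = statistics.fmean(row)
        sd = statistics.pstdev(row)
        if sd == 0:
            continue  # a constant phenotype carries no information
        scaled.append([(v - mean) / sd ** power for v in row])
    if not scaled:
        raise ValueError("no phenotype with non-zero variance")
    m = len(scaled)
    # Average the cross-products of scaled values over the m usable phenotypes.
    return [[sum(z[i] * z[j] for z in scaled) / m for j in range(n_samples)]
            for i in range(n_samples)]


# Hypothetical corrected phenotypes: 3 phenotypes (rows) x 3 samples (columns).
toy = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [5.0, 5.0, 5.0],  # constant, skipped
]
K = relationship_matrix(toy)
```

With full standardization (`power=1.0`) the diagonal of the resulting matrix averages to 1, which is a convenient scale for downstream variance component estimation.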

More about MPtools

Developer: Cai Wentao

Contact: caiwentao@caas.cn

MPtools Software Development

MPtools V1.0.0 is written in C++. It can extract or delete specified molecular phenotypes or samples; filter out molecular phenotypes and samples with too many zero values; filter out molecular phenotypes whose coefficient of variation is too small; correct proportional molecular phenotypes (microbiome) and non-proportional molecular phenotypes (transcriptome, proteome, metabolome); and construct a kinship matrix for molecular phenotypes. The kinship matrix is used in the downstream IASbreeding software to estimate genetic parameters and calculate breeding values.

Installing MPtools

MPtools can be run directly after extracting the archive:

tar -xzvf MPtools-1.0.0-Linux-x86_64.tar.gz
./MPtools 

Using MPtools

  • Extracting and Deleting Specific Molecular Phenotypes or Samples
Extract specific genes:
./MPtools --txtfile Muscle.tpm.autosome.txt --extract extract.id --out Result
Exclude specific genes:
./MPtools --txtfile Muscle.tpm.autosome.txt --exclude exclude.id --out Result
Keep specific samples:
./MPtools --txtfile Muscle.tpm.autosome.txt --keep keep.id --out Result
Remove specific samples:
./MPtools --txtfile Muscle.tpm.autosome.txt --remove remove.id --out Result
Options can be combined, for example to extract specific genes from specific samples:
./MPtools --txtfile Muscle.tpm.autosome.txt --keep keep.id --extract extract.id --out Result

  • Filtering Molecular Phenotypes with a Coefficient of Variation that is Too Small
Extract genes in the top 10% of the coefficient of variation:
./MPtools --txtfile Muscle.tpm.autosome.txt --CV-top 10 --out Result
Extract genes with a coefficient of variation greater than 0.5:
./MPtools --txtfile Muscle.tpm.autosome.txt --CV-value 0.5 --out Result

  • Filtering Molecular Phenotypes and Samples with Excessively High Zero Values
Retain molecular phenotypes that are expressed (>0) in at least 15% of individuals:
./MPtools --txtfile Muscle.tpm.autosome.txt --pheno-rate 0.15 --out Result
Retain samples where at least 85% of the molecular phenotypes are expressed (>0):
./MPtools --txtfile Muscle.tpm.autosome.txt --sample-rate 0.85 --out Result

  • Correction of Non-Proportional Molecular Phenotypes (e.g., Transcriptome, Proteome, Metabolome)
Quantile normalization:
./MPtools --txtfile Muscle.tpm.autosome.txt --quantile-norm --out Result
Inverse normal transformation:
./MPtools --txtfile Muscle.tpm.autosome.txt --inverse-norm --out Result
Options can be combined in various ways. For example, extract specific genes from specific samples, retain genes expressed (>0) in at least 15% of individuals, keep the top 95% of genes ranked by coefficient of variation, then apply quantile normalization and inverse normal transformation:
./MPtools --txtfile Muscle.tpm.autosome.txt --keep keep.id --extract extract.id --pheno-rate 0.15 --CV-top 95 --quantile-norm --inverse-norm --out Result

  • Correction of Proportional Molecular Phenotypes (suitable for Microbiome)
Convert count data to proportion data:
./MPtools --txtfile 1044otutab.filter.txt --count --out wwwdd
Centered log-ratio transformation:
./MPtools --txtfile 1044otutab.filter.txt --pheno-rate 0.15 --count --clr --out Result
Perform geometric Bayesian filling for zeros and then apply the centered log-ratio transformation:
./MPtools --txtfile 1044otutab.filter.txt --pheno-rate 0.15 --count --GBM --clr --out Correct

  • Constructing a Kinship Matrix for Molecular Phenotypes
Correction and kinship matrix construction can be done in one step:
./MPtools --txtfile 1044otutab.filter.txt --pheno-rate 0.15 --count --GBM --clr --kinship --out kinship
Alternatively, correct first, then use --kinship:
./MPtools --txtfile 1044otutab.filter.txt --pheno-rate 0.15 --count --GBM --clr --out Correct
./MPtools --txtfile Correct.txt --kinship --out kinship
You can increase or decrease the variance weight of each molecular phenotype using --kin-lambda:
./MPtools --txtfile 1044otutab.filter.txt --pheno-rate 0.15 --count --GBM --clr --kinship --kin-lambda 1.1 --out kinship
Generate a binary kinship matrix:
./MPtools --txtfile 1044otutab.filter.txt --pheno-rate 0.15 --count --GBM --clr --kinship --kin-lambda 1.1 --kin-bin --out kinship