2000
Volume 17, Issue 4
  • ISSN: 1574-8936
  • E-ISSN: 2212-392X

Abstract

Aims: This study aims to formulate inter-feature correlations as engineered features.

Background: Modern biotechnologies tend to generate a huge number of features per sample, while an OMIC dataset usually has only a few dozen or a few hundred samples because of the high cost of generating OMIC data. Therefore, many bio-OMIC studies assumed inter-feature independence and selected features with high phenotype associations.

Objective: Many features are closely associated with each other due to their physical or functional interactions, and these associations may be utilized as a new view of the features.

Methods: This study proposed a feature engineering algorithm based on correlation coefficients (FeCO) that utilizes the correlations between a given sample and a few reference samples. The proposed FeCO network features were comprehensively evaluated using 24 bio-OMIC datasets.

Results: The experimental data suggested that the newly calculated FeCO network features tended to achieve better classification performance than the original features when the same popular feature selection and classification algorithms were used. The FeCO network features were also consistently supported by the literature. FeCO was utilized to investigate the high-order engineered biomarkers of breast cancer and detected the PBX2 gene (Pre-B-Cell Leukemia Transcription Factor 2) as one of the candidate breast cancer biomarkers. Although the two methylated residues cg14851325 (P-value = 8.06e-2) and cg16602460 (P-value = 1.19e-1) within PBX2 did not have a statistically significant association with breast cancer, the high-order inter-feature correlations showed a significant association with breast cancer.

Conclusion: The proposed FeCO network features capture the high-order inter-feature correlations as novel features and may facilitate the investigation of complex diseases from this new perspective.
The source code is available on FigShare at 10.6084/m9.figshare.13550051 or the web site http://www.healthinformaticslab.org/supp/.
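
As a rough illustration of the idea stated in the Methods (correlations between a given sample and a few reference samples used as new features), the following Python sketch may help. It is not the authors' FeCO implementation: the abstract does not specify the correlation measure, how the reference samples are chosen, or how the network features are assembled, so the use of the Pearson coefficient and the function name `correlation_features` are assumptions made only for this example.

```python
import numpy as np


def correlation_features(samples, references):
    """Hypothetical helper: Pearson correlation of each sample with each reference.

    samples:    array of shape (n_samples, n_features)
    references: array of shape (n_references, n_features)
    returns:    array of shape (n_samples, n_references)

    Assumes no row is constant across its features (a constant row has
    zero variance, so its correlation is undefined).
    """
    samples = np.asarray(samples, dtype=float)
    references = np.asarray(references, dtype=float)

    # Center every row across its own features.
    s = samples - samples.mean(axis=1, keepdims=True)
    r = references - references.mean(axis=1, keepdims=True)

    # Scale every centered row to unit length; the dot product of two
    # such rows is then their Pearson correlation coefficient.
    s = s / np.linalg.norm(s, axis=1, keepdims=True)
    r = r / np.linalg.norm(r, axis=1, keepdims=True)

    return s @ r.T


# Toy data (made up for illustration): two samples, one reference sample.
toy_samples = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
toy_references = [[2.0, 4.0, 6.0, 8.0]]
engineered = correlation_features(toy_samples, toy_references)
```

Under these assumptions, each sample is re-described by one value per reference sample, so the engineered matrix has as many columns as there are reference samples rather than as many as there are original features.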

/content/journals/cbio/10.2174/1574893617666220124123303
2022-05-01
2025-06-01