2021年9月，我院数据科学系熊巍副教授、博士生潘晗联合在国际重要期刊《Statistics in Medicine》上发表论文《Interaction screening for high-dimensional heterogeneous data via robust hybrid metrics》。
A novel model-free interaction screening approach called the hybrid metrics is introduced for high-dimensional heterogeneous data analysis. The metrics established based on the variation of conditional joint distribution function are measurements of interaction that include both size and direction. They are robust and can work with many types of response variables, including continuous, discrete, and categorical variables. We can apply the hybrid metrics to effective interaction selection for classification, response index models, and Poisson regression, among others. When dealing with classification, the hybrid metrics are capable of capturing both nonlinear category-general and category-specific interaction effects, providing us with a comprehensive overview and precise discovery of category information. When faced with a continuous response, the hybrid metrics perform fairly well even if the signal strength is weak, behaving as if the true interactions were known. To facilitate implementation, a fast two-stage procedure which naturally and efficiently enforces both strong and weak heredity is advocated. We further demonstrate their superior performances over popular competitors by exhaustive simulations and a SRBCT real data example. Supplementary materials for this article are available online.