Speaker Bio: Professor Chien-Ming Chi received his Ph.D. in Economics from National Taiwan University in 2020. He is currently an Assistant Research Fellow at the Institute of Statistical Science, Academia Sinica. His research interests include high-dimensional nonparametric prediction, high-dimensional nonparametric inference, and time series analysis.
Abstract: Orthogonal-split trees perform well, but evidence suggests oblique splits can enhance their performance. This paper explores optimizing high-dimensional $s$-sparse oblique splits from $\{(\vec{\omega}, \vec{\omega}^{\top} X_i) : i \in \{1, \ldots, n\},\ \vec{\omega} \in \mathbb{R}^{p},\ \|\vec{\omega}\|_2 = 1,\ \|\vec{\omega}\|_0 \le s\}$ for growing oblique trees, where $s$ is a user-defined sparsity parameter. We establish a connection between SID (sufficient impurity decrease) convergence and $s_0$-sparse oblique splits with $s_0 \ge 1$, showing that the SID function class expands as $s_0$ increases, enabling the capture of more complex data-generating functions such as the $s_0$-dimensional XOR function. Thus, $s_0$ represents the unknown potential complexity of the underlying data-generating function. Learning these complex functions requires an $s$-sparse oblique tree with $s \ge s_0$ and greater computational resources. This highlights a trade-off between statistical accuracy, governed by the size of the SID function class, which depends on $s_0$, and computational cost. In contrast, previous studies have explored SID convergence using orthogonal splits with $s_0 = s = 1$, where runtime was less critical. Additionally, we introduce a practical framework for oblique trees that integrates optimized oblique splits alongside orthogonal splits into random forests. The proposed approach is assessed through simulations and real-data experiments, comparing its performance against various oblique tree models.
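To make the candidate set above concrete, the following is a minimal, illustrative Python sketch, not the paper's algorithm: it searches over $s$-sparse, unit-norm directions $\vec{\omega}$ and thresholds taken from the projections $\vec{\omega}^{\top} X_i$, scoring each candidate split by its impurity decrease. The function names (`best_s_sparse_oblique_split`, `impurity_decrease`), the random search over `n_directions` candidate directions, and the toy XOR data are assumptions introduced here for illustration only.

```python
import numpy as np

def impurity_decrease(y, left_mask):
    """Decrease in total sum of squared errors when y is split by left_mask."""
    y_left, y_right = y[left_mask], y[~left_mask]
    if len(y_left) == 0 or len(y_right) == 0:
        return 0.0
    sse = lambda v: float(np.sum((v - v.mean()) ** 2))
    return sse(y) - sse(y_left) - sse(y_right)

def best_s_sparse_oblique_split(X, y, s, n_directions=100, seed=0):
    """Hypothetical sketch: random search over s-sparse unit-norm directions w
    and thresholds drawn from the projections w^T X_i, i.e. candidates from
    {(w, w^T X_i) : ||w||_2 = 1, ||w||_0 <= s}."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_w, best_t, best_gain = None, None, -np.inf
    for _ in range(n_directions):
        support = rng.choice(p, size=min(s, p), replace=False)  # sparsity pattern
        w = np.zeros(p)
        w[support] = rng.standard_normal(len(support))
        w /= np.linalg.norm(w)          # enforce ||w||_2 = 1
        z = X @ w                       # projections w^T X_i
        for t in z:                     # candidate thresholds taken from the data
            gain = impurity_decrease(y, z <= t)
            if gain > best_gain:
                best_w, best_t, best_gain = w, t, gain
    return best_w, best_t, best_gain

# Toy 2-dimensional XOR signal: in population, any single orthogonal split
# yields zero impurity decrease here, whereas an oblique split does not.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(300, 5))
y = np.sign(X[:, 0] * X[:, 1])
w, t, gain = best_s_sparse_oblique_split(X, y, s=2)
print("support:", np.nonzero(w)[0], "threshold:", round(t, 3), "gain:", round(gain, 3))
```

With $s = 2$ the random search can place its support on the two XOR coordinates and obtain a positive impurity decrease, whereas restricting to $s = 1$ (effectively orthogonal splits) leaves essentially no achievable gain on this example; this is the statistical-versus-computational trade-off sketched in the abstract, under the illustrative assumptions stated above.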