LightGBM's min_hessian parameter and XGBoost's min_child_weight parameter
LightGBM's min_hessian parameter has an alias, min_child_weight. Let's look at how the official documentation describes each of the two parameters:
LightGBM's min_hessian:
minimal sum hessian in one leaf. Like min_data_in_leaf, it can be used to deal with over-fitting
XGBoost's min_child_weight:
Minimum sum of instance weight (hessian) needed in a child. If the tree partition step results in a leaf node with the sum of instance weight less than min_child_weight, then the building process will give up further partitioning. In linear regression task, this simply corresponds to minimum number of instances needed to be in each node. The larger min_child_weight is, the more conservative the algorithm will be.
From the descriptions above, it is clear that these two parameters mean the same thing.
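A minimal sketch of where the two settings appear in each library's configuration; the feature matrix X and binary label vector y are assumed to be data you already have, and the threshold value 5.0 is arbitrary:

```python
import lightgbm as lgb
import xgboost as xgb

params_lgb = {
    "objective": "binary",
    # min_hessian / min_child_weight are aliases of this parameter in LightGBM
    "min_sum_hessian_in_leaf": 5.0,
}
params_xgb = {
    "objective": "binary:logistic",
    # minimum sum of instance hessians required in a child node
    "min_child_weight": 5.0,
}

lgb.train(params_lgb, lgb.Dataset(X, label=y), num_boost_round=10)
xgb.train(params_xgb, xgb.DMatrix(X, label=y), num_boost_round=10)
```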
Write the second derivative as L''(F_{m-1}(x_i), y_i), where L is the loss function and F_{m-1} is the sum of the first m-1 trees. The minimum hessian sum is then the sum of L''(F_{m-1}(x_i), y_i) over all samples x_i belonging to a node. As a special case, when L is the squared loss the second derivative is the constant 1, so min_hessian is equivalent to min_data_in_leaf, the minimum leaf sample size. When L is the log loss, the second derivative is p_i(1 - p_i), i.e. σ(F_{m-1}(x_i))[1 - σ(F_{m-1}(x_i))].
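A small sketch of the hessian sums described above; F_prev is a hypothetical array of raw scores F_{m-1}(x_i) for the samples that fall in one leaf:

```python
import numpy as np

F_prev = np.array([-1.2, 0.3, 2.0, 0.7])   # hypothetical raw scores in a leaf

# Squared loss L = 0.5 * (y - F)^2  ->  L'' = 1, so the hessian sum
# is simply the number of samples in the leaf (same role as min_data_in_leaf).
hess_l2 = np.ones_like(F_prev)
print(hess_l2.sum())             # 4.0 == leaf sample count

# Log loss with a sigmoid: p_i = sigmoid(F_{m-1}(x_i)), L'' = p_i * (1 - p_i),
# so uncertain samples (p near 0.5) contribute most to the hessian sum.
p = 1.0 / (1.0 + np.exp(-F_prev))
hess_logloss = p * (1.0 - p)
print(hess_logloss.sum())        # this is the quantity compared against min_hessian
```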
For why the second derivative of the log loss is p_i(1 - p_i), see https://stats.stackexchange.com/questions/231220/how-to-compute-the-gradient-and-hessian-of-logarithmic-loss-question-is-based
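A quick numerical check of that result (not from the original post): the second derivative of the log loss with respect to the raw score F should match p(1 - p) computed analytically.

```python
import numpy as np

def logloss(F, y):
    # log loss expressed in terms of the raw score F via the sigmoid
    p = 1.0 / (1.0 + np.exp(-F))
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

F, y, eps = 0.8, 1.0, 1e-4
# central finite difference approximation of the second derivative
num_hess = (logloss(F + eps, y) - 2 * logloss(F, y) + logloss(F - eps, y)) / eps**2
p = 1.0 / (1.0 + np.exp(-F))
print(num_hess, p * (1 - p))     # the two values should agree closely
```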