Academy of Mathematics and Systems Science, CAS
Colloquia & Seminars

Speaker:
Prof. Zhouwang Yang, University of Science and Technology of China
Inviter:  
Title:
Sparse Deep Neural Networks Through L_{1,\infty}-Weight Normalization
Time & Venue:
2018.10.24 15:30 N205
Abstract:
We study L_{1,\infty}-weight normalization for deep neural networks as a means of achieving sparse architectures. Empirical evidence suggests that inducing sparsity can relieve overfitting and that weight normalization can accelerate convergence. In this paper, we theoretically establish generalization error bounds for both regression and classification under L_{1,\infty}-weight normalization. We show that the upper bounds are independent of the network width and depend on the network depth k only through a factor of \sqrt{k}; these are the best available bounds for networks with bias neurons. These results provide theoretical justification for the use of such weight normalization. We also develop an easily implemented gradient projection descent algorithm that yields a sparse neural network in practice. We perform various experiments to validate our theory and demonstrate the effectiveness of the resulting approach.
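As a minimal sketch of such a projection step (an illustrative assumption, not the speaker's reference implementation): if the L_{1,\infty} norm of a weight matrix is taken to be the maximum of the \ell_1 norms of its rows, then projecting onto the L_{1,\infty} ball reduces to independent row-wise \ell_1-ball projections, computable by the standard sorting method of Duchi et al. (2008). The function names project_l1_ball and project_l1inf below are hypothetical.

import numpy as np

def project_l1_ball(v, radius):
    # Euclidean projection of v onto {x : ||x||_1 <= radius} via sorting.
    # Entries are soft-thresholded, so small weights become exactly zero.
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]                      # magnitudes, descending
    cssv = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, v.size + 1) > cssv - radius)[0][-1]
    theta = (cssv[rho] - radius) / (rho + 1.0)        # shrinkage threshold
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def project_l1inf(W, radius):
    # Enforce ||W||_{1,inf} = max_i ||row_i||_1 <= radius, one row at a time.
    return np.vstack([project_l1_ball(row, radius) for row in W])

# One projected-gradient step on a layer's weights W with gradient G:
#   W = project_l1inf(W - learning_rate * G, radius)

Because the projection soft-thresholds each row, weights whose magnitude falls below the threshold are set exactly to zero, which is the mechanism by which such a constraint produces a sparse network during training.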
 

 
