Reassuring and Troubling Views on Graph-Based Semi-Supervised Learning


Yoshua Bengio
University of Montréal
Canada


This talk starts with positive views on graph-based semi-supervised learning. First, many of these algorithms can be approximately or exactly framed as minimizing a criterion that trades off between two objectives: similar inputs should yield similar outputs (over all examples), and the predicted output should be close to the target output (over labeled examples). One potential problem with these approaches is that training may require time cubic in the number of examples. We outline an acceleration technique based on selecting a subset of landmark examples. The final topic of this talk concerns doubts about the generalization performance of such models when the underlying manifold is high-dimensional or has a complex, highly curved shape. We show that the number of required labeled examples may grow arbitrarily with the complexity of the underlying target concept, even though a simple representation of the decision surface may exist. This "curse of dimensionality" problem is shared with other kernel-based algorithms in which the kernel is local.
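The trade-off criterion described above can be sketched concretely. The following is a minimal illustration, not the speaker's exact method: it minimizes a smoothness term over a Gaussian-affinity graph plus a fit term on the labeled examples, via a direct linear solve whose cost is cubic in the number of examples; the `extend` helper hints at the landmark idea, where labels for new points are smoothed from a solved subset. The function names (`rbf_affinity`, `graph_ssl`, `extend`), the Gaussian affinity choice, and the parameter `mu` are assumptions for illustration.

```python
import numpy as np

def rbf_affinity(X, sigma=1.0):
    # Pairwise Gaussian affinities: a *local* kernel, as discussed in the talk.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def graph_ssl(X, y, labeled, sigma=1.0, mu=1.0):
    """Minimize (1/2) sum_ij W_ij (f_i - f_j)^2 + mu * sum_{i labeled} (f_i - y_i)^2.

    Setting the gradient to zero gives (L + mu*S) f = mu * S y, where
    L = D - W is the graph Laplacian and S selects labeled examples.
    The dense solve below is O(n^3) -- the cubic cost mentioned in the abstract.
    """
    W = rbf_affinity(X, sigma)
    L = np.diag(W.sum(axis=1)) - W
    s = labeled.astype(float)
    A = L + mu * np.diag(s)
    b = mu * y * s
    return np.linalg.solve(A, b)

def extend(X_new, X_land, f_land, sigma=1.0):
    # Landmark-style induction (assumed form): predict new points as a
    # kernel-weighted average of the solutions at the landmark examples,
    # avoiding a fresh cubic solve over the full dataset.
    d2 = ((X_new[:, None, :] - X_land[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    return (W @ f_land) / W.sum(axis=1)

# Toy usage: two well-separated clusters, one labeled point in each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(20, 2)),
               rng.normal(4.0, 0.3, size=(20, 2))])
y = np.zeros(40)
labeled = np.zeros(40, dtype=bool)
labeled[0], labeled[20] = True, True
y[0], y[20] = -1.0, 1.0
f = graph_ssl(X, y, labeled, sigma=1.0, mu=10.0)
f_new = extend(np.array([[0.1, -0.1], [4.2, 3.9]]), X, f)
```

With only two labels, the smoothness term propagates the labels across each cluster, so the sign of `f` recovers the cluster membership of every unlabeled point.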