A good showcase for random forests is a sparse signal embedded in a high-dimensional space: only features 1 and 2 carry the true class structure, while features 3 through d are pure nuisance variables. A random forest often beats a single deep tree here because feature subsampling decorrelates the trees, so no single noise coordinate gets split on consistently across the ensemble, and averaging washes out the spurious splits that do occur.
Validation error versus number of trees. The dashed gray line marks the single-deep-tree baseline.
Random-forest feature importance. Ideally features 1 and 2 dominate, because they are the only informative coordinates.
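The experiment above can be sketched as follows. This is an illustrative reconstruction, not the original code: the dataset parameters (20 total features, 2 informative) and model settings are assumptions chosen to match the description.

```python
# Sparse-signal setup: only the first two coordinates are informative,
# the remaining 18 are pure noise. All parameter choices are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# shuffle=False keeps the informative features in columns 0 and 1.
X, y = make_classification(
    n_samples=2000, n_features=20,
    n_informative=2, n_redundant=0, n_repeated=0,
    shuffle=False, random_state=0,
)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Baseline: one fully grown tree, free to split on noise coordinates.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Forest: feature subsampling plus averaging over 200 trees.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

print(f"single deep tree accuracy: {tree.score(X_val, y_val):.3f}")
print(f"random forest accuracy:    {forest.score(X_val, y_val):.3f}")

# The two informative coordinates should dominate the importances.
top2 = sorted(np.argsort(forest.feature_importances_)[-2:])
print("top-2 features by importance:", top2)
```

Sweeping `n_estimators` and plotting validation error against the single-tree baseline reproduces the first figure; `forest.feature_importances_` gives the bar heights for the second.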