Decision tree learning, used in statistics, data mining and machine learning, uses a decision tree as a predictive model which maps observations about an item to conclusions about the item's target value...
We used decision trees, specifically an extension known as a random forest, to predict whether a project would be funded. The graphic below shows an example decision tree for the DonorsChoose data. It begins by asking whether the project is in a high-poverty school; depending on the answer, it then splits the data on either the school's region or the project's posting date. Once a project reaches the bottom of the tree (the "leaves"), the tree makes a prediction. A random forest is a combination of many trees, each constructed by splitting on a random subset of the available factors listed to the right. Because the forest combines the votes of many trees, it is less prone to overfitting and less sensitive to noise than any single tree.
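To make the tree and forest mechanics concrete, here is a minimal sketch in plain Python. It is not the model we fit. The field names (high_poverty, region, month_posted), the leaf outcomes in the hand-written tree, and the helper names are all hypothetical and made up for illustration. The sketch also assumes two common conventions that the description above does not spell out: each tree is trained on a bootstrap sample, and the random subset of factors is redrawn at every split.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return the (feature, value) equality split that lowers impurity most, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        for v in {r[f] for r in rows}:
            left = [y for r, y in zip(rows, labels) if r[f] == v]
            right = [y for r, y in zip(rows, labels) if r[f] != v]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, v), score
    return best

def build_tree(rows, labels, n_features, rng, depth=0, max_depth=3):
    """Grow one tree; each split only sees a random subset of the features."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority  # leaf: predict the majority class
    features = rng.sample(sorted(rows[0]), n_features)
    split = best_split(rows, labels, features)
    if split is None:
        return majority
    f, v = split
    yes = [(r, y) for r, y in zip(rows, labels) if r[f] == v]
    no = [(r, y) for r, y in zip(rows, labels) if r[f] != v]
    return {
        "feature": f,
        "value": v,
        "yes": build_tree([r for r, _ in yes], [y for _, y in yes],
                          n_features, rng, depth + 1, max_depth),
        "no": build_tree([r for r, _ in no], [y for _, y in no],
                         n_features, rng, depth + 1, max_depth),
    }

def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, dict):
        tree = tree["yes"] if row[tree["feature"]] == tree["value"] else tree["no"]
    return tree

def build_forest(rows, labels, n_trees=25, n_features=2, seed=0):
    """Train n_trees trees, each on a bootstrap sample (an assumption of this sketch)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_features, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote across all trees in the forest."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

# Hand-written tree shaped like the one described above.
# The leaf outcomes are invented for illustration, not findings.
example_tree = {
    "feature": "high_poverty", "value": True,
    "yes": {"feature": "region", "value": "West", "yes": "funded", "no": "unfunded"},
    "no": {"feature": "month_posted", "value": 9, "yes": "funded", "no": "unfunded"},
}
example_project = {"high_poverty": True, "region": "West", "month_posted": 3}
example_prediction = predict_tree(example_tree, example_project)
```

Calling build_forest on a list of project dictionaries and a matching list of "funded" or "unfunded" labels returns a list of trees, and predict_forest then takes a majority vote over them for a new project.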
The majority of funding requests get funded, so the difficulty was in finding the unfunded projects. We classified 19 percent of projects as unfunded, with only 32 percent of the projects classified as unfunded being classified in error. Our model included an initial attempt at classifying the strength of essays, an important hidden variable, but the essay classifier could be much improved in future iterations. Subsequent analysis has identified additional features that should increase the strength of the model. We plan to remove the temporal features because they prevent the model from predicting funding for future projects. In this context, date posted is essentially a proxy for the site's traffic, which is relatively consistent between months. Future versions of the random forest should use site traffic for the month preceding the posting of the project in place of the date on which the project was posted.
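One way the proposed replacement of the posting date with prior-month site traffic might look is sketched below. The visit counts, the (year, month) lookup table, and the names previous_month, prior_month_traffic, and prior_month_visits are hypothetical; no real traffic data is shown here.

```python
from datetime import date

def previous_month(d):
    """Return (year, month) for the calendar month before date d."""
    if d.month == 1:
        return (d.year - 1, 12)
    return (d.year, d.month - 1)

def prior_month_traffic(date_posted, monthly_visits):
    """Look up site visits for the month before posting; None if not recorded."""
    return monthly_visits.get(previous_month(date_posted))

# Hypothetical monthly visit counts keyed by (year, month); not real data.
monthly_visits = {(2012, 12): 410_000, (2013, 1): 395_000}

# Hypothetical project record.
project = {"date_posted": date(2013, 1, 15), "high_poverty": True}

# Drop the raw date and add the lagged traffic feature in its place.
features = {k: v for k, v in project.items() if k != "date_posted"}
features["prior_month_visits"] = prior_month_traffic(project["date_posted"], monthly_visits)
```

Because the traffic figure comes from the month before posting, it is already known when a new project is submitted, which is what would let the model score future projects.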

