Calculating Feature Importance in Data Streams with Concept Drift using Online Random Forest

I had the privilege of presenting my work on “Calculating Feature Importance in Data Streams with Concept Drift using Online Random Forest” at IEEE BigData 2014 in Washington, DC this past week. The conference was an interesting mix of presentations …

Which Armstrong?

In my last post, I described how we used Elias, an exploratory analysis tool for large-scale information extractions, to look at which (person, location) pairs are mentioned together most often, and then extended the analysis to distinguish how those entities are co-mentioned. Today, …

Stochastic Gradient Descent

Most machine learning algorithms and statistical inference techniques operate on the entire dataset. Think of ordinary least squares regression or estimating generalized linear models. The minimization step of these algorithms is either performed in place in the case of OLS …

Destructuring in Mathematica

A technique that I have found particularly useful in Lisp-like languages such as Mathematica and Clojure is destructuring. Destructuring is a mechanism for extracting parts of an expression. The Lisp “code as data” paradigm lends itself to destructuring techniques. I recently leveraged …
