I once had a professor in a data mining class tell me, "Cleaning your data is 90% of the work in data mining." All the wonderful algorithms that pioneers in the field have developed over the years, such as association rules, k-means clustering, and neural network classification, are utterly useless until that up-front labor is done. Even basic statistical calculations such as the mean or standard deviation are unreliable unless someone first disambiguates all the "widgets" from the "widgits [sic]" or, even worse, the "iwdgets [sic]."
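To make the "widgets" problem concrete, here is a minimal sketch in Python. It assumes a hypothetical list of canonical item names and uses the standard library's fuzzy matcher to fold misspellings into them before computing a mean. The names `CANONICAL_ITEMS`, `normalize_item`, and `mean_spend_by_item`, the sample records, and the 0.8 similarity cutoff are all illustrative assumptions, not part of any real spend analytics product.

```python
from collections import defaultdict
from difflib import get_close_matches
from statistics import mean

# Hypothetical canonical item names; a real system would load these from a catalog.
CANONICAL_ITEMS = ["widgets", "gaskets", "brackets"]


def normalize_item(raw_name, vocabulary=CANONICAL_ITEMS, cutoff=0.8):
    """Map a raw item name to its closest canonical name, if one is close enough."""
    cleaned = raw_name.strip().lower()
    matches = get_close_matches(cleaned, vocabulary, n=1, cutoff=cutoff)
    # Fall back to the cleaned name when nothing in the vocabulary is similar enough.
    return matches[0] if matches else cleaned


def mean_spend_by_item(records):
    """Group (item_name, amount) pairs by normalized name and average the amounts."""
    amounts_by_item = defaultdict(list)
    for raw_name, amount in records:
        amounts_by_item[normalize_item(raw_name)].append(amount)
    return {item: mean(amounts) for item, amounts in amounts_by_item.items()}


# Made-up sample data: three spellings of the same item plus one unrelated item.
records = [
    ("widgets", 100.0),
    ("widgits", 300.0),
    ("iwdgets", 200.0),
    ("Gaskets", 50.0),
]

print(mean_spend_by_item(records))
```

Without the normalization step, the three spellings would be averaged as three separate items. A fixed cutoff is a blunt instrument, though: set it too low and genuinely different items get merged, so real cleansing usually needs review on top of a heuristic like this.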
Recently I've been working on developing automated methods of knowledge discovery in spend data. The advantages of working with cleansed data in spend analytics software are obvious; however, a few are less apparent on the surface, and those are the ones I would like to share.
Now that I've nearly completed this post, I'm going to go back and concentrate on the remaining 10% of the work, the part that will be the most fruitful for any organization. I think I'll write some code that uses linear regression to identify seasonal spend behavior. That should only take an hour...
Michael Jaskiewicz, Senior Software Engineer, Spend Radar