May 17, 2009
Mining Software Repositories
I’ve spent the weekend at MSR. A lot of yesterday’s papers were very meta, talking about platforms and strategies rather than specific results, which felt strange, since I came into the workshop without much background.
I’ve had a few conversations about what work has been done in mining software repositories, and I’m now quite optimistic about what we can discover from the Hadley source code that would be useful and interesting. There’s a poster about identifying tightly coupled bits of code in an industrial codebase that I think would be a good starting point for looking at the Hadley code. My idea is that it should be possible to run the same state-of-the-art coupling analysis and see how closely the tightly coupled units of code correspond to tightly coupled physical processes.
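As a rough sketch of what a coupling analysis could look like: one common approach in this area is co-change coupling, where two files count as coupled if they tend to be modified in the same commits. This is not necessarily the method used in the poster, and everything here is an assumption for illustration: the function name `cochange_coupling`, the `min_support` threshold, the Jaccard score, and the made-up file names.

```python
from collections import Counter
from itertools import combinations


def cochange_coupling(commits, min_support=2):
    """Score file pairs by how often they change together.

    commits: iterable of lists of file paths, one list per commit.
    Returns {(file_a, file_b): jaccard_score} for pairs that changed
    together in at least min_support commits.
    """
    file_counts = Counter()   # commits touching each file
    pair_counts = Counter()   # commits touching both files of a pair
    for files in commits:
        unique = sorted(set(files))  # sort so each pair has one canonical order
        file_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    coupling = {}
    for (a, b), together in pair_counts.items():
        if together < min_support:
            continue  # too few co-changes to say anything
        either = file_counts[a] + file_counts[b] - together
        coupling[(a, b)] = together / either  # Jaccard similarity
    return coupling


# Hypothetical commit history; these file names are invented.
commits = [
    ["ocean/mixing.f90", "ocean/heat_flux.f90"],
    ["ocean/mixing.f90", "ocean/heat_flux.f90", "atmos/radiation.f90"],
    ["atmos/radiation.f90"],
    ["ocean/mixing.f90"],
]

for pair, score in cochange_coupling(commits).items():
    print(pair, round(score, 2))
```

The per-commit file lists would have to be extracted from the version control history first. The interesting step would then be comparing the high-scoring pairs against which physical processes each file models.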
More generally, given how much of the work so far has been done on large open source repositories available online, I wonder how far the results can be generalized. I’m willing to be convinced, but I’m not there yet. I don’t know if the Linux kernel is developed in a representative way. I know people who contribute, and most of their contributions are small driver or hardware-compatibility fixes. I wonder how that compares to other projects…