June 29, 2009
I’m really not good at updating this blog, so I’m going to make myself do a quick update.
Last week, I downloaded the CCSM and NCAR climate models and spent a little time reading through them, along with a couple of papers describing them. I also spent a day setting up version control. I don’t like Git that much, but I can live with it.
This week, I’m going to continue looking through the models, partly just to get a better grip on FORTRAN. I’m also going to start writing code to try some of the “source code searching” and other natural language clustering techniques, and run them on a subset of the files. I’ll start with LSI and work through the other approaches that have shown success.
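A minimal sketch of what that LSI step might look like, assuming scikit-learn is available; the file names and their “identifier soup” contents are invented for illustration, not taken from the real models.

```python
# LSI sketch: treat each source file as a bag of identifiers, weight with
# TF-IDF, then project into a low-rank "concept" space via truncated SVD.
# File names and contents below are made up for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

files = {
    "radiation.f90":    "solar flux longwave shortwave albedo cloud",
    "ocean_mixing.f90": "salinity temperature diffusivity mixing depth",
    "sea_ice.f90":      "ice albedo melt thickness shortwave flux",
}

tfidf = TfidfVectorizer().fit_transform(files.values())
lsi = TruncatedSVD(n_components=2).fit_transform(tfidf)  # concept space
sims = cosine_similarity(lsi)  # pairwise file similarity in that space
```

On real code, the vectorizer would need a tokenizer that splits FORTRAN identifiers and strips keywords, but the pipeline shape stays the same.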
Jon mentioned that we may be able to use the pfortran parser on its own, which would make the static-analysis-based clustering much easier. I don’t yet have much of a plan for this, but hopefully I will soon.
June 10, 2009
This is kind of a meta-post, a to-do list for the next week or so: what do I need to get a handle on before I can start looking through the Hadley Centre data? (Whenever that becomes available – hopefully soon. The practical part of my brain is getting twitchy.) Each part needs links to papers, and better summaries.
Code ownership: The summer students (Sarah and Ainsley) are trying to create a social network graph, or to suggest experts, based on code ownership. I should look more into the pitfalls some of the poster presenters warned me about: artifacts of the software development process. In particular, there are always the configuration files and build scripts, which would need special treatment.
Similarly, the bug assignment and expertise finder research includes a few different ideas of what code ownership or expertise looks like. There are the number-of-changes and lines-changed metrics for ascertaining expertise, and a few others.
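The two metrics can disagree, which is part of why the literature keeps both. A toy sketch of each, over an invented change log (the record format and author names are placeholders, not any real VCS output):

```python
# Toy ownership metrics from a change log: "commits touched" and
# "lines changed" per (file, author). The log records are invented.
from collections import defaultdict

log = [  # (author, file, lines_changed), one record per commit
    ("sarah",   "ocean.f90", 120),
    ("ainsley", "ocean.f90", 15),
    ("sarah",   "ocean.f90", 40),
    ("ainsley", "ice.f90",   200),
]

commits = defaultdict(int)
lines = defaultdict(int)
for author, fname, n in log:
    commits[(fname, author)] += 1
    lines[(fname, author)] += n

def owner(fname, metric):
    """Author with the largest value of the chosen metric for a file."""
    per_author = {a: v for (f, a), v in metric.items() if f == fname}
    return max(per_author, key=per_author.get)
```

Here `owner("ocean.f90", commits)` and `owner("ocean.f90", lines)` happen to agree, but a single large refactoring commit can easily make them diverge.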
Tightly Coupled Code: This comes down to clustering. There are a few different approaches, from ignoring the source code entirely and clustering based on check-in time, check-in comments, etc., to clustering based on a source code searching technique (LSI or other statistical methods; some dictionary-based methods may be less applicable to code written in FORTRAN). There are also static-analysis-based methods, which means I’ll want to evaluate the static analysis tools available for FORTRAN, and their applicability.
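The “ignore the source, use the check-ins” end of that spectrum can be sketched very simply: call two files coupled if they tend to appear in the same commits. The change sets below are invented; Jaccard similarity is just one plausible coupling measure.

```python
# Co-change coupling sketch: similarity of two files = Jaccard overlap of
# the sets of commits that touched each. Change sets are invented.
changesets = [
    {"ocean.f90", "mixing.f90"},
    {"ocean.f90", "mixing.f90", "namelist.cfg"},
    {"ice.f90"},
    {"ocean.f90", "mixing.f90"},
]

def jaccard(a, b, changesets):
    with_a = {i for i, c in enumerate(changesets) if a in c}
    with_b = {i for i, c in enumerate(changesets) if b in c}
    return len(with_a & with_b) / len(with_a | with_b)
```

Any standard clustering algorithm can then run on the resulting similarity matrix.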
Of course, there are also the configuration settings and their relationship to changes, as well as the data files. Does linking data files to the configuration settings that were checked in when the data files first appeared help? Would mining run records help?
Comparing the clusters formed by different methods would be both a validating technique and a way to potentially identify smaller clusters of files.
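One concrete way to do that comparison is the Rand index: the fraction of file pairs on which two clusterings agree (grouped together in both, or separated in both). A minimal version, with invented labels:

```python
# Rand index sketch: compare two clusterings of the same files by the
# fraction of file pairs they treat the same way. Labels are invented.
from itertools import combinations

def rand_index(labels_a, labels_b):
    agree = total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b
        total += 1
    return agree / total
```

The adjusted Rand index (which corrects for chance agreement) would be the more defensible choice for a real comparison, but this shows the idea.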
Personally, I like the idea of using a smaller unit of analysis than a file, maybe a line.
For displaying this coupling, I like something like software cartography (here) that I talked to Adrian Kuhn about. I can almost see how to extend the visualization for different clustering techniques, maybe.
Traceability from experiment to code to published results: I’m still casting about for appropriate prior results on this one, but one thing I’d like to do is see whether the tightly coupled clusters are associated with _something_ in the experimental world.
OK, that’s enough. I’ll try to tie the interesting papers and posters in soon.
June 5, 2009
I got back from a far too relaxing week in BC, mostly diving near Nanaimo, and was ready to get back to work. Unfortunately, my two and a half year old tablet was not – halfway through ICSE, it decided to quit. The battery showed signs of excessive heat, as did the motherboard, and the wireless was burnt out. Combine that with a generally banged up case I was planning to replace, and a keyboard that my cats have stepped on one too many times (you could see the bend when I took it off), and the tablet is fixable, but at a cost greater than that of a new laptop. I’m keeping my eyes open for spare parts, though. I resurrected my previous laptop with some soldering and an automotive power connector. And duct tape. Kept it going another year or two like that. I don’t give up on electronics easily.
So, I’ve got a new home and transit machine, an Asus Eee PC 1008HA. It amusingly comes with an Energy Star sticker, whatever that means for a laptop. It came with Windows XP, and ran relatively well with that. Me being me, I have configured it as a dual-boot Windows 7 release candidate and Debian Linux machine.
The wireless driver for Windows 7 is picky, and I needed to change the router settings to get it to work. There’s a bizarre bug in which the screen goes blank right after boot, but behaves again if I put the machine to sleep and wake it up.
The Linux install was pretty straightforward, except that the wired network adapter doesn’t have a driver that works – the hardware’s too new. That will come, though, or I could probably fix it myself if I get desperate. Wireless just works, though I had to install the unstable release to get the updated driver.
I didn’t have any data loss. I back up obsessively – I set up a little scp script to put things onto my university account, where everything is backed up, and I have an external USB drive full of weekly-ish images.
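The rough shape of that backup step, as a sketch: the user, host, and paths below are placeholders, not my actual setup.

```python
# Backup sketch: bundle a directory into a dated archive, then copy it
# to a remote account with scp. User, host, and paths are placeholders.
import datetime

def backup_commands(src_dir, user="me", host="university.example.edu"):
    stamp = datetime.date.today().strftime("%Y%m%d")
    archive = f"backup-{stamp}.tar.gz"
    return [
        f"tar czf {archive} {src_dir}",           # bundle the working tree
        f"scp {archive} {user}@{host}:backups/",  # copy to the backed-up account
    ]
```

Running the returned commands (e.g. via a cron job) gives the weekly-ish snapshots.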
I did spend a lot of time configuring and installing. I don’t know any way to get around that, especially when both the hardware and the OS change. It made me ponder the whole “reproducible research” thing a bit – how confident would I be that a program run on this new setup would produce the same results as one run on my old one?
May 19, 2009
There’s a conversation going on about how to share knowledge between stakeholders who are not (software) engineers. This, together with the idea of wanting more specialization in software engineering, makes me wonder if the future we want to talk about is one where everyone is a programmer and a specialist in some other area. I think that when the organization at my old job, working on train control systems, worked well, it was because in some sense everyone was more of a specialist in subway control systems, and moved between software, hardware, systems, and safety. Like Greg says about the scientists, the key knowledge is the domain; the other things can be learned more easily.
I don’t know if I’m interested in this specifically as a research area, but I am coming to think that my assumptions in this area influence what I think is interesting research in supposedly unrelated areas. Assumptions about people and what they’re like do underlie the more mathematical or technical papers, I think.
I tried to talk with as many people from MSR as I could about the Hadley Centre codebase, and what might be possible. I’m encouraged, since it seems like anything I can do with respect to mining the version control system, even if it’s recreating another result, would be interesting.
Subversion is probably the VCS that the most work has been done on, and there’s a reasonable amount of work on identifying closely coupled modules. One fellow I need to look up presented a poster (with a short paper) on a case study applying the state of the art to an industrial codebase (in Java). He told me a little about the things I should look for in Subversion, and that I should really understand the workflow to draw good conclusions. I gather that at Hadley work is done on branches and then merged, which would make things easiest, but I’ll need to verify that.
Some work has used static analysis to identify coupling, which might work with FORTRAN. It wouldn’t capture connections with data files or with configuration options, though. Some of the source code searching could be useful, particularly for the connections between code and data files and configuration file options.
If this is at all successful, I think that a recommended list of files, functions, data files, or configurations to consider along with a proposed change might be immediately useful. If there’s some sort of meta-tagging in the VCS, it could be both improved and validated with some very simple feedback or logging mechanisms.
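The recommender part could start as little more than a co-change counter: given the file being edited, suggest the files that have most often changed in the same commits. The change history below is invented.

```python
# Change-recommendation sketch: suggest the files that most often
# co-changed with the file being edited. The history is invented.
from collections import Counter

history = [
    {"ocean.f90", "mixing.f90", "run.cfg"},
    {"ocean.f90", "mixing.f90"},
    {"ocean.f90", "forcing.dat"},
    {"ice.f90", "albedo.f90"},
]

def recommend(changed_file, history, k=2):
    counts = Counter()
    for commit in history:
        if changed_file in commit:
            counts.update(commit - {changed_file})
    return [f for f, _ in counts.most_common(k)]
```

Logging whether developers actually went on to touch the suggested files would give exactly the kind of simple feedback loop mentioned above.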
In a very long-term kind of way, this can tie into deciding how “necessary” the couplings are. If we tie this to the idea of mapping the code back to physical processes, then we can see how many clusters of highly coupled files (say, files that often change together) tie back to related physical processes. It might be possible to say something about what the clusters correspond to, and whether any of them are “historical accidents” as opposed to processes which are necessarily related. Or Conway’s Law kinds of effects – to mention the workshop from today. It might be that some of the coupling groups are related to a key person’s area of expertise, or to related areas of climate science that do not describe tightly coupled processes (areas where maybe there should be some sort of pseudo-inheritance? Maybe?)
To tie the code back to formulae and results, Greg has suggested provenance tools. This I’ll need to sort out more.
Specifics all to come – this is more of a “To Do” list.
May 18, 2009
I chatted with Jon about my “What is done?” ideas. He was thinking about studying the quality of scientific modeling software, and in the conversation I came to think that how a person, team, or organization decides whether software is ready for “release” (check-in to the mainline, handing off to testers or customers, use, whatever is relevant) tells you how they really measure software quality, and what they think software quality is. Whatever means someone uses to decide software is ready to pass on is their real quality metric. It’s not the boxes that are there to be checked off; it’s how you decide whether you can check the box yet.
There are a lot of approaches to this, and a lot of ways to break down this question. I guess the first one I should answer is “Why should anyone care?” I’m still working on that, but I am a little inspired by a sign I saw while doing some work for my father-in-law at a dairy. It said something like “Every employee has the authority to shut down a line because of leakers”. Who in a small software development group has the authority to shut down the release, to declare it “not done”? And how and why do they make that decision? Basically, do the de facto ways this decision is made agree with what people might say…
May 17, 2009
Went for a walk with some other conference attendees. I am trying to keep track of everything that anyone suggested I look into or that they would like to know about climate models.
I also chatted a little bit about a question I haven’t been able to leave behind – when is software “done”? I always have wondered how that decision is made. Everyone makes the “the money runs out” joke when I ask, but I really want to know. In particular, I wonder how people or teams learn to make the “done” decision, and how this differs between domains, teams, and people.
Last couple MSR papers now, the developer focused ones.
From talking to the MSR attendees about their research, the big problem that seems to come up is knowing how developers use their version control system, and also what information is captured by version control systems. I’m going to need to look into Subversion, and see what useful data is just lost.
I’ve spent the weekend at MSR. A lot of the papers yesterday were very meta, talking about platforms and strategies, which was strange, as I came into the workshop without much background.
I’ve had a few conversations about what work has been done in mining software repositories, and I’m now quite optimistic about what we can discover from the Hadley source code that would be useful and interesting. There’s a poster about identifying tightly coupled bits of code in an industrial codebase that I think would be a good starting point for looking at the Hadley code. I have an idea that it should be possible to do this same state-of-the-art coupling analysis and see how much the tightly coupled units correspond to tightly coupled physical processes.
Just in general, looking at how much of this work has been done on large open source repositories available online, I wonder how far the results can be generalized. I’m willing to be convinced, but I’m not yet convinced. I don’t know if the Linux kernel is developed in a representative way. I know people who contribute, and much of their contribution consists of small driver or hardware-compatibility fixes. I wonder how that compares to other projects…
March 31, 2009
After our weekly climate change brainstorming meeting, I have a couple thoughts about topics that would interest me as a PhD topic.
Inherent coupling in climate models: Climate models involve tightly coupled physical processes. How much is it possible to decouple the modules of a climate model?
Is it possible or useful to reverse-engineer requirements for a climate model based on its change history?
I’m sure there’s more – we haven’t really got to the verification/validation problems, and I’ve always been interested in testing. It’s nice to have a couple ideas I’d be happy working on, though. It helps me feel a little less like I’m drifting aimlessly, reading papers and trying to put ideas together.
I’m hoping to use my required courses to learn more about things I’d like to know. Right now I have:
Ontologies – I’m sure there’s a breadth course in that area that would be of interest.
Statistics – Just useful, and you can never understand statistics too well.
Numerical analysis/computation – certainly possible to take for breadth.
???? – I’m sure I’m missing something. If not, well, I can always take a theory course. A little math is fun.