I was helping my brother with a script called incith-google used by one of his IRC bots. It acts as a bridge between IRC and Google, allowing IRC users to run Google queries from the comfort of their IRC client. The script performs the search on the IRC user’s behalf and returns the result to the IRC channel.
The problem is that it had broken. Once I started investigating, it became readily apparent that it was bound to break, and surely had broken before. This is because the script was making the request to Google as if it were a web browser, and receiving HTML output in return. The IRC client only wants to see a short bit of text, so the script attempts to parse out the juicy bits from the HTML output. This process of screen scraping is wobbly at best; any subtle formatting or presentation change to the Google search results page can break the script entirely. This happens frequently due to the web’s inherent tendency to mix content and presentation semantics; the good news is that the W3C is finally catching on to this.
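To make that fragility concrete, here is a minimal sketch in Tcl. Everything in it is made up for illustration: the two HTML snippets are not real Google markup, and the scrape_title proc is a hypothetical stand-in, not the actual incith-google code. It just shows a pattern tied to one page layout going silent after a purely cosmetic change.

```tcl
# Hypothetical markup: the same search result before and after a cosmetic
# redesign. Neither string is real Google output.
set old_html {<div class="g"><a href="http://example.com/">Example Domain</a></div>}
set new_html {<div class="result"><a class="title" href="http://example.com/">Example Domain</a></div>}

# Hypothetical scraper: the pattern is welded to the exact tag and attribute
# layout of the old page.
proc scrape_title {html} {
    if {[regexp {<div class="g"><a href="[^"]*">([^<]*)</a>} $html -> title]} {
        return $title
    }
    # No match: the caller gets an empty string and the bot says nothing useful.
    return ""
}

puts "old layout: '[scrape_title $old_html]'"
puts "new layout: '[scrape_title $new_html]'"
```

The first call pulls out the title; the second comes back empty even though the content is identical and only the presentation changed.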
A more robust way to do these types of interactions would be to use a more rigorous and standardized process for asking questions and getting answers; in a word: API. A large part of the value of an API is that the semantics of exchanging information and interacting with other pieces of software are ‘locked down’, in the sense that the API vendor wants you to trust that those semantics will continue to work as designed, for the life of the API. It just so happens that Google has several APIs for accessing their various services. Because Eggdrop scripting is done solely in Tcl, I started there. It didn’t take long to find Web Services for Tcl, which is precisely what I needed.
One downside to the Web Services for Tcl library is that it depends on a significant number of other (mostly non-standard) Tcl libraries. Tcl has no package / module / library management system that might ease the process of installing these other libraries, so it took me a bit of time to get it all going (and I mostly know what I’m doing). The average Eggdrop user learned Unix in order to run Eggdrop itself, so they are typically not of the sysadmin variety (as it happens, Eggdrop is what initially got me into Unix, though I’ve come a bit of a ways since 1995 or so).
A bigger downside is that Google no longer allocates new SOAP API keys, so if you didn’t get one prior to Dec 5, 2006, you are s.o.l. I got one for some reason, even though I’m only really getting around to using it now.
Anyway, after going through all the trouble of getting this library operational, I figured I’d go ahead and bang out a quick Eggdrop interface to Google based on Web Services for Tcl, so there you have it.