Two days ago, flickr was updated, and the update slightly changed the formatting of its pages. For almost everything on fastr, that doesn't matter because fastr uses the flickr API. But the flickr API doesn't provide a way to get a list of tags used by a group, so that part is scraped straight from the HTML on flickr. When that HTML changed, fastr groups broke. They didn't break right away because the tags are only updated once a day, but eventually none of the group games were working. I just fixed that. Thanks to 'j/gimmeacookie' for pointing out this bug. Or at least I think that's the bug that was pointed out. Maybe that was a different bug?
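
For anyone curious what this kind of breakage looks like in code, here is a minimal sketch of the general idea, not fastr's actual code. It assumes, purely for illustration, that tags show up on the page as links whose href contains "/tags/"; that marker and the names TagLinkParser, scrape_group_tags, and refresh_tags are all made up. The point is that a scraper can treat "found nothing" as a sign that the page layout changed and fall back to the tags from the last good run, instead of quietly replacing them with an empty list.

```python
import logging
from html.parser import HTMLParser


class TagLinkParser(HTMLParser):
    """Collects the text of every <a> whose href contains a marker substring."""

    def __init__(self, href_marker):
        super().__init__()
        self.href_marker = href_marker  # hypothetical marker, e.g. "/tags/"
        self.tags = []
        self._in_tag_link = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            self._in_tag_link = self.href_marker in href

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_tag_link = False

    def handle_data(self, data):
        if self._in_tag_link and data.strip():
            self.tags.append(data.strip())


def scrape_group_tags(html, href_marker="/tags/"):
    """Return the tag names found in html, or raise if none were found."""
    parser = TagLinkParser(href_marker)
    parser.feed(html)
    parser.close()
    if not parser.tags:
        # An empty result most likely means the markup changed under us.
        raise ValueError("no tags found; the page layout may have changed")
    return parser.tags


def refresh_tags(html, cached_tags):
    """Daily refresh: keep the previous tags if the scrape comes back empty."""
    try:
        return scrape_group_tags(html)
    except ValueError as err:
        logging.warning("tag scrape failed, keeping cached tags: %s", err)
        return cached_tags
```

With something like this, a redesign would show up as a warning in the logs on the next daily refresh, while the games keep running on the previous day's tags.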

If you see something wrong with fastr, please report it either by emailing me or by posting a comment on any relevant blog post, and provide as much detail as you can, e.g. which browser you're using, which game you're playing, what exactly happened. As much as I'd like to, I can't play fastr 24/7 on a wide variety of browsers, so these reports are very useful in finding and fixing problems. Thanks, and sorry about the brokenness.

 

Dare Obasanjo writes on screen scraping: "It seems Richard Macmanus has missed the point. The issue isn't depending on a third party site for data. The problem is depending on screen scraping their HTML webpage. An API is a service contract which is unlikely to be broken without warning. A web page can change depending on the whims of the web master or graphic designer behind the site."

I completely agree that screen scraping is an undesirable practice, but I think it's actually Dare who is missing the point. No one scrapes a site that has an API, so comparing the two doesn't make much sense. Of course the API is better, but what good does that do us when we want data in a certain format and there is no API? Answer: no good at all. Not only does scraping not compete with APIs, it actually encourages the development of APIs by establishing a market for structured data and by competing for the site's customers until the API exists.

Case in point: I scrape MySpace and provide RSS feeds. I don't even use MySpace myself, but I want to read the weblogs of my friends who do via RSS, so I made this scraper. When I put it online, I discovered there are many other people who want to use MySpace RSS feeds. When these people do a Google search for "myspace rss," they currently find a full page of results, beginning with my scraper. MySpace.com only shows up on the second page. This is bad business for MySpace. They've lost control of the experience of these potential customers. They need an API.

And they got one. I don't imagine my scraper had much to do with it in this case, but I have scraped smaller sites that didn't provide a feed until my scraper was being used by a significant portion of their readers. This puts such sites in a position where they must either provide the structured data their visitors clearly want or lose those visitors.

Screen scraping brings an increased risk of breakage, as I've experienced a few times already with the MySpace scraper. But without an alternative API, the structured data is worth that risk for many people. Dare writes, "Web 2.0 isn't about screenscraping." I say Web 1.9beta1 is about screen scraping.