For those of you following along with the experiment at home, initial tests suggest adding beer to a web geek meeting does indeed make it more interesting. Or maybe having it on a weekend does that. Or maybe a smaller group. Or pizza. In any case, I had fun yesterday, learned a lot, and I would no longer say “Denver’s web community is boring me.”

Here’s something I learned before lunch even started (aside from the restaurant not opening until noon — oops): I had always assumed punch card computers worked by circuits connecting through the holes in cards. But it turns out they were entirely mechanical, with air bursting through the holes and flipping switches on the other side. Neat.

And of course there were web-specific topics too. But you’ll have to come next time to experience all the excitement. Maybe we’ll swap out pizza for ice cream or something, continue experimenting. In the distance, I still have my eye on BarCampDenver. Things are looking up.

 

Since moving to Denver, I’ve made a concerted effort to familiarize myself with the Denver web geek community. I’ve signed up for every email list I could find and attended every meeting loosely related to what I find interesting on the interwebs. But frankly, Denver’s web community is boring me. I haven’t found a single person playing with iPhone web interfaces nor the Wii JavaScript API nor microformats nor OpenID nor anyone who went to BarCampDenver last year. And few even know what these things are.

The lack of interest in any of those specific trends is not itself a problem. They may all turn out to be just passing fads. But I think these are symptoms of a larger problem: Denver web geekery is not a creative industry; it's a manufacturing industry. There’s an important difference between a web manufacturer, someone who churns out sites on an assembly line schedule using the exact same tools over and over again, and a web artisan, someone who takes the time to investigate, compare, and understand those tools, and could maybe fix them when they break.

I want to be in the latter camp, not least because the former camp is being gradually replaced by increasingly automated tools. The manufacturing industry is not a sustainable career path; robots can manufacture. The plethora of web-related jobs and scarcity of candidates (I’m seeking a new coworker, by the way) in Denver is, I think, another symptom of this problem. The jobs are open because they’re unappealing. They’re boring, low-paying, and bound for obsolescence. Just as monoculture is bad for biological communities, it’s bad for this industry.

That’s my theory anyway. So what should I do about it? I’m still trying to figure that out, but when I suggested to the Denver web design MeetUp that meeting once a month wasn’t enough, some good ideas came up. Specifically: 1) go outside, and 2) drink beer. So as a test of these ideas, I’ve proposed an event, and a few people have signed on already.

What: Web Geek Lunch
When: August 18th, 2007 12pm - 2pm

So if you or anyone you know is near Denver and interested in beer, pizza, free WiFi, and web geekery, please spread the word. Hopefully we’ll be doing this more in the future.

 

A while back I was presented with the possibility that I might be moving to Carbondale at the end of the year, or I might not. I decided to take the copout and put off this decision as long as possible. Meanwhile I’ve been more actively seeking out contract work so that if I did end up moving to Carbondale, I’d at least have some sort of income.

Last week Jessica finished her first term of teaching (half a semester), and she’s relatively happy with her job in Carbondale. Last week I also made more money from contract work than I did at my full time job. So I made a decision to definitely move to Carbondale. I’m not moving immediately because I have a lot I want to do before moving. In addition to packing up my life in Des Moines, I also have a lot of projects at work I’d rather not leave without completing. So I’m moving around the end of the year.

But I’m probably not quitting my job at the end of the year. We haven’t worked out all the details, but I’ll likely be working remotely for a few months at least, doing pretty much what I do now in Des Moines, only doing it in Carbondale. On the down side, I expect it will be slightly more work for everyone involved to communicate strictly with no face to face contact. On the up side, I will no longer eat all the candy in the accounting department. And I can fold my laundry while I read my email. I do that anyway, but with personal email. Now I’ll get paid to do it.

I’ll be happy to continue working on familiar projects with familiar people, but I’m also enjoying the freelance work I’ve been doing. It’s a good way to prioritize my seemingly endless interest in web development. Things people are willing to pay me to do tend to be more interesting than updating my existing unpaid web services whenever someone wants something more out of them. I’m tempted to pay someone to answer the endless stream of comments on my original MySpace RSS post.

So this is all good. The problem, if it can be called a problem, is that I’m now facing a scenario of having too much paid work. I’ll definitely prioritize the steady work from my current employer, but I hate to entirely give up on the freelance stuff. I have a lot of stuff I’d like to do, and I often think it would be nice to pay someone else to do it. So that’s what I’m going to do.

I’m going to keep accepting as much freelance work as I can get doing projects that interest me, and I’m going to take the money I make from that and pay someone else to do other projects that interest me. So I’m looking to hire web developers (and less so designers) with interests similar to my own.

What are those interests? Probably the main disqualifier is that I’m willing to do interesting work for little pay. I make more money than I need, not because I make a lot, but because I don't need a lot. And I expect anyone I hire to have a similar prioritizing of, say, making data make sense over high income. Students would be good.

Beyond that, I’m looking for developers who use technologies I use or at least that I’m interested in learning. The former include PHP, MySQL, JavaScript, CSS, semantic HTML, and a few other odds and ends. The latter include Ruby on Rails, Python, PostgreSQL, Flash, and maybe Perl. Basically, ASP, Java, and ColdFusion coders need not apply. It would be nice if one had an understanding of basic things like binary and HTTP, but I don’t really want to get too picky about specific knowledge. I’m more interested in curiosity.

So if you know anyone looking for some interesting web development work for reasonable but not extravagant income, please let me know.

 

Steve Rogne is a friend of mine from university days. We were apartment mates for about a year and a half. He recently became the Director of Zen Shiatsu Chicago. I’ve done a bit of revamping of their website for him, including giving Steve his own URL (because everyone should have a URL). I hope to get a blog set up for them soon (because everyone should have a blog). Speaking of blogs and new jobs, Dan has both (as everyone should).

Back to me. Last week I met with the bassist, let’s call him "Chris" (because that’s his name), and we "jammed." Whenever anyone talks about "jamming," I think of it as some sort of improvisational music performance that I don’t know how to do. But really it’s just shorthand for "playing music." At least that’s what we did. It went okay for the first time. It looks tentatively like the makings of a band (because everyone should have a band).

Speaking of bands, a week from now Jessica and I are having a wedding (because everyone should have a wedding). As far as the state of Iowa is concerned, we were actually married back in January, but the ceremony will be next weekend, and as far as our grandmothers are concerned, no ceremony means no marriage. We’ve attempted to plan it such that it will be more fun than stressful, so hopefully it will turn out that way.

If you’re interested in showcasing your home for a chance to win … looks like about $25,000 in prizes … Benjamin Moore’s current promotion began at 12am yesterday morning. I made the entry form. I also recently worked on the website for ICM, so if you need some work done on your ethanol refinery (because everyone should have an ethanol refinery), I recommend checking that out.

If you don’t yet have a URL, a blog, a new job, a band, a wedding, or an ethanol refinery, please let me know if I can be of any assistance. Because really, everyone should.

 

As I move projects to MakeDataMakeSense.com, I’m giving everything a pretty icon and otherwise trying to make it look more "professional," under the theory that people are more likely to pay attention to something that looks like it might be for sale. And this is apparently working as evidenced by one project now listed in the Museum of Modern Betas as a "beta by inheritance." I guess I just need to tack a meaningless "beta" icon on everything to complete the sell-out process (without actually selling anything).

 

The Western Iowa Advantage website went "live" (no longer a placeholder) yesterday. I’ve been working on it, along with other people and among other projects, for the past month or so. Everyone at work seems pretty excited about the result. I suspect the enthusiasm is largely due to the visual look of the site. It’s pretty. People like pretty.

But what I find most interesting about the site is something no one else will ever notice: it’s very semantic. The markup describes the data. The news is all hAtom, the events are all hCalendar, and the personal and organization information is all hCard. You can run my Greasemonkey script and import the events into Google Calendar. You can run the hCards through Brian Suda's X2V and get them into your address book. You can use Chris Casciano's NetNewsWire script to subscribe to the news without bothering with a separate feed (although there is a separate feed too).
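For anyone who has not looked inside a microformatted page, here is a rough sketch of what "the markup describes the data" means in practice. This is not the code behind the site, X2V, or my Greasemonkey script. The sample markup, the person, and the `HCardParser` class are all made up for illustration; only the class names (`vcard`, `fn`, `org`, `url`, and so on) come from hCard. It uses Python's standard `html.parser` module and handles only the simplest cases.

```python
from html.parser import HTMLParser

# Hypothetical sample markup. The class names are hCard's; the data is invented.
SAMPLE = """
<div class="vcard">
  <a class="url fn" href="http://example.org/jane">Jane Example</a>
  <span class="org">Example Development Group</span>
</div>
"""


class HCardParser(HTMLParser):
    """Collect a few hCard properties from elements inside class="vcard"."""

    PROPERTIES = {"fn", "org", "title", "tel", "email"}
    # Void elements never get a closing tag, so they stay off the stack.
    VOID = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.cards = []   # one dict per vcard found
        self._stack = []  # set of class names for each open element

    def _in_card(self):
        return any("vcard" in classes for classes in self._stack)

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        if "vcard" in classes:
            self.cards.append({})
        self._stack.append(classes)
        # The url property lives in the href attribute, not in the text.
        if self._in_card() and "url" in classes and attrs.get("href"):
            self.cards[-1]["url"] = attrs["href"]

    def handle_endtag(self, tag):
        if tag in self.VOID:
            return
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self._in_card():
            return
        card = self.cards[-1]
        for classes in self._stack:
            for prop in classes & self.PROPERTIES:
                card[prop] = (card.get(prop, "") + " " + text).strip()


def parse_hcards(html):
    """Return a list of dicts, one per hCard found in the HTML string."""
    parser = HCardParser()
    parser.feed(html)
    return parser.cards


if __name__ == "__main__":
    for card in parse_hcards(SAMPLE):
        print(card)
```

The point is that the parser needs no knowledge of what a name or an organization looks like. It only reads the labels the markup already carries.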

And who is going to do these things? I expect absolutely no one. Certainly no one I know of using the site. So why do I bother? I don’t know. I don’t know why I like data so much. I don’t know why people like pretty things so much. Maybe some day I’ll figure it all out. Meanwhile, I make websites.

 

Dave Rogers points to Jaron Lanier, who has better ideas than me regarding the tendency of the web to ignore individuals. He writes:

The illusion that what we already have is close to good enough, or that it is alive and will fix itself, is the most dangerous illusion of all. By avoiding that nonsense, it ought to be possible to find a humanistic and practical way to maximize value of the collective on the Web without turning ourselves into idiots. The best guiding principle is to always cherish individuals first.

That’s exactly the illusion I’ve been working under on the web for the past decade. I thought if I just threw good ideas into the web long enough, it would improve itself. It wears me out. "Always cherish individuals first" sounds like pretty good advice. I think I haven’t been doing enough of that because I don’t trust individuals very much. Individuals can be mean. The collective, at worst, is just careless. And therein lies the appeal of the illusion: it’s relatively safe.

But I’ve had enough with the relative safety of the idiot hive mind. It’s time for me to get back to the mess and riskiness of smart individuals. So … anyone want to start a website with me? I have a lot of good ideas.

 

Yesterday, Technorati released Microformats Search. My first thought was something like "finally..." It's been six months since Google released Google Base. At that time many people were pointing out that submission-based search isn't going to work in the long-term because it takes more work than crawl-based search, and all other things being equal, laziness wins. The advantage of Google Base over traditional search was that it used structured data, so the obvious solution was to make the web more structured.

Microformats make the web more structured, so I thought it would be interesting to see how much structured data a big search company like Google could hope to find by crawling instead of asking for submissions. I made Microformat Base, and before long, the microformats community, the broader semantic web community, and the entire world were ... completely ignoring it. No one used it. No one talked about it. No one copied the open source.

Well, that's not entirely true. A few people on the microformats discussion list said some nice things about it. But the conversation there quickly went back to what would later become hAtom. I went on with my life, playing with other technologies in my spare time. I hoped maybe in another year or two, someone with some venture capital would pick up the idea and make the web a more interesting place.

Then yesterday, as I said, Technorati released Microformats Search, and I thought "finally..." I think I have a pretty good track record for predicting where technologies are headed, and I continue to be annoyed by how long it takes the rest of the world to catch up to my imagined future. I didn't expect Technorati to act as quickly as it did, but I made Microformat Base in a day, so it really shouldn't have taken them six months.

I was happy Technorati had caught up, but then I started reading what Technorati was writing about it. The first thing I read was on the blog of Tantek Çelik. I subscribe to his blog because he talks about interesting technology, and I like to pay attention to where things are heading. Tantek wrote, "I invite you to come take a look at this first of a kind realtime microformats search engine." At that point, I thought "Hmm...that's an odd way of phrasing that. It almost sounds like he has no idea that I ever made Microformat Base..." Then Tantek sent an email to the microformats discussion list, writing, "There are some indexers of specific microformats right now (e.g. Reevoo and Kritx both index hReviews), but no general microformats search engine." At that point, I realized that Tantek really did have no idea I had made Microformat Base, which was surprising because I knew he had previously commented on it.

I wrote a response, saying, "Hmm... I'm pretty sure I was indexing contacts, events, and reviews several months ago...I'd assume you missed that, except that you commented on it." And Kevin Marks, who also works at Technorati, responded to that with, "Great stuff Scott, do you want to get pings relayed?" At this point, I was trying to be charitable with my take on what was going on here, but it really looked to me like Technorati was intentionally ignoring what I had done, except where they realized that I could be feeding them data.

So I wrote that in response:

What I didn't expect was this feeling that microformats are increasingly just another product owned and sold by Technorati. I'm disappointed that Technorati has apparently developed selective amnesia here regarding others' work. Tantek says "Technorati believes in the voice of the individual," but here I am, an individual, and everyone from Technorati is pretending like I don't exist except where I could contribute more data toward Technorati's profit. I have no doubt that if I had done the same work at a corporation, I wouldn't be seeing phrases like "no general microformats search engine" and "first of a kind" coming out of Technorati. And I'm certainly not the only individual who has worked on this. Dozens of individuals helped lay the groundwork for Technorati's newest product, but not a single one is acknowledged in Technorati's discussion of the Microformats Search - only corporations. This will certainly make me think twice before experimenting further with microformats in my free time.

And I sent it off and thought "bridges: burninated!" I'm not working at one of Technorati's partners, so if microformats really are a product Technorati intends to claim or imply ownership of, I have little to lose by criticizing this trend. I was content to do so in the pseudo-privacy of the microformats email list, but then Tantek responded on his public blog.

He wrote a long post, starting with Mea culpa, including my name nine times, linking to me five times, and going on about how great it is that web workers of the world are uniting under the microformats flag of data freedom or something like that. And I guess I'm happy that Tantek is now reaffirming my romantic notion of what the web could be. The only problem is, I don't really believe it any more.

As I pointed out yesterday, a microformat search engine isn't the first project I've done a proof-of-concept for that later became a successful part of the web. In fact, nearly everything I've ever done online has followed this pattern, going back ten years to when I edited the raw compiled source of a browser plugin (a simpler task back then) to allow users to add any search engine they wanted to a field in their browser and released it for free as "AnySearch Extras". It's now ten years later, and Firefox is the most popular browser to have this same functionality built in. So about 10% of the world has caught up to what I was waiting for ten years ago.

I'm tired of waiting for the web to pay attention. The web is awful at paying attention. One might think Technorati would be a little better at paying attention than others, given what it does and its ownership of a trademark on the phrase "attention index." But experience suggests Technorati is just like the rest of the web. Interesting technology doesn't get the web's attention. Open source and open data don't get the web's attention. I've been doing both for several years. The web hasn't noticed. As Christian Montoya recently observed, money is what gets the web's attention.

It's not just that Tantek originally gave thanks only to corporations with money and ignored all the individuals working on microformats. Tantek's a busy guy, so he forgets things. But the whole web today gave notice to Technorati, with money, and ignored an individual who did the same thing, without money. Tantek wrote:

Companies take note - on the internet, there will always be smarter, more clever people building on each other's work than your secret internal committees, your architecture councils, your internal discussion forums -- no matter how many supergeniuses you think you may have hired away and locked up with golden shackles in your labs.

This has long been a popular mythology on the web, but I no longer believe it. I am the prototypical clever person building on others' work and encouraging others to build on mine. Companies can safely ignore me.

 

For a long time now, I've applied the "release early, release often" principle, only to have someone else "release later, release better."

Well I've finally learned my lesson and I'm done releasing early and often. From now on, I'm releasing later and better.

 

I have a rough plan to split this website, randomchaos.com, into three different sites. A while back I bought typewriting.org, and I think I'll put my weblog, music, and other types of writings there. It will look simple and artsy like Oblivio or Letters to an Unknown Audience. Surely muted earth tones will improve my writing. I'll probably put the music at music.typewriting.org, write more about it, and have comments on it. I expect I'll stop hosting music for other people. The other people have never seemed very interested in my hosting their music. I think maybe I had a fantasy of starting a record label or something, but that's clearly not going to happen.

Today I bought MakeDataMakeSense.com, which I think is a nice description of what I like to do with technology. I plan to move my more tech-oriented projects there. The microformat aggregator, the graphing widget, the Greasemonkey scripts, the regular expression debugger, it all tries to make data make sense. So that site will focus on those projects. It will look like a software company's website, maybe Panic or Ranchero, but still free. Surely fancy icons will improve my software.

With everything else moved to the other two sites, I'll make randomchaos.com a games site. It will include fastr and other games I've been playing with and need to finish. It will look like kurnik or Yahoo games. Surely extensive white space will improve my games.

There are random projects throughout randomchaos right now that don't clearly belong on any of the three sites: typewriting.org, MakeDataMakeSense.com, or a new games-focused randomchaos.com. The photos? I'll either just stop keeping a separate photo gallery, or write more about photos and put it on photo.typewriting.org. The computer-generated poetry? That could probably fit in on either typewriting.org or MakeDataMakeSense.com. I'll pick one. Anything I can't find a home for is probably not worth keeping.

I expect one such thing will be the source code viewer. I don't think I'm going to make any of these three sites open source, though I will continue to make specific projects open source. I've been running randomchaos.com as an open source website for a few years now, and I think it's been a wasted effort. No one has suggested a single improvement on any of the code I've written. And the only people who have actually used the code elsewhere have first asked me if they could despite the license clearly granting them this permission. If I'm going to be interacting with people using my code, I might as well just email them the files and not have to worry about maintaining an automated open-source website system that no one ever uses.

I'll post notes and redirect everything as I move or remove it, but that's my plan so far: out with the old, split up the new, stop trying to boil the ocean, and drink more tea.

 

Danny Ayers does a quote of the day, mostly for semantic web stuff, and I've decided to steal the format. Here's today's:

Basically, if you want to be gung-ho about it, the entire web is a copyright violation.

Roger Benningfield

 

I gather most people involved in microformats are coming from a background heavy in more formally structured data, e.g. RDF, XML, relational databases. I'm coming more from the opposite background: scraping. Recently Phil Jones described a web in which metadata resides in scraping/parsing applications meaning documents need not be so descriptive, and Danny Ayers predictably responded with an argument for the Semantic Web, in which metadata resides in documents meaning applications need not be so smart.

In Danny's comments, I tried to point out that the applications Phil predicts can produce the documents Danny predicts. I already do a small amount of this with all my scrapers. On Disemployed, I add location and time information to each job posting and publish that information in a regular format (HTML, RSS 1, RSS 2, or Atom). I could admittedly be structuring this information more formally to better encourage reuse, but the data is there, in any case, where it wasn't before. But this is relatively simple data to add. I know when I found each job post and where it came from, so my application doesn't need to be very smart. What are the limits of a smart application? Could a very crafty application actually make microformats unnecessary?
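To make the "not very smart" application concrete, here is a minimal sketch of the idea: take a scraped post, attach when it was found and where it came from, and publish it as an RSS 2.0 item. This is an illustration, not the Disemployed code. The function names, the field names, and the choice to carry the origin in the RSS `<source>` element are all assumptions.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from xml.etree import ElementTree as ET


def enrich(post, source, found_at=None):
    """Return a copy of a scraped post with its origin and discovery time added.

    `post` is assumed to be a dict with at least "title" and "link".
    """
    found_at = found_at or datetime.now(timezone.utc)
    return {**post, "source": source, "found_at": found_at}


def to_rss_item(post):
    """Render an enriched post as an RSS 2.0 <item> element."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = post["title"]
    ET.SubElement(item, "link").text = post["link"]
    # RSS 2.0 uses RFC 822 style dates in pubDate.
    ET.SubElement(item, "pubDate").text = format_datetime(post["found_at"])
    # Assumption: record where the post came from in the <source> element.
    source = ET.SubElement(item, "source", url=post["source"])
    source.text = post["source"]
    return item


if __name__ == "__main__":
    scraped = {"title": "Web developer", "link": "http://example.com/jobs/1"}
    item = to_rss_item(enrich(scraped, "http://example.com/jobs"))
    print(ET.tostring(item, encoding="unicode"))
```

Nothing here is guessed from the text of the post. The scraper already knows both facts, which is why this kind of metadata is cheap to add.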

Let's take one microformat, hCard, and see how guessable the microformat metadata would be if it weren't there, on a scale of zero to ten:

  • fn (full name): this could at best be a guess. A name could feasibly contain pretty much any combination of letters. I'm sure someone somewhere has named a child "Asdf Jkl." Microformats are the easiest way to identify fn. 0/10.
  • n (name): same here. 0/10.
  • nickname: again, no easy way out. 0/10.
  • photo: here we have a winner, mostly. I'm guessing eight times out of ten, any image referenced within something identified as hCard information will be a photo. Depending on how lucky we feel, microformats could be dispensed with here. 7/10.
  • bday (birthday): this is a bit complicated. Dates follow very standard formats, and we could probably identify dates in a jumble of text with about 95% accuracy. But how do we know if a given date is a birthday? We can assume relatively safely based on proximity to words like "birthday." 9/10.
  • adr (address): I would have guessed this would be very hard to identify as a pattern, but Google is already doing this. Of course, Google is limiting to US addresses. 5/10.
  • label: at first, this appears to be as open-ended as names, but the variety in practice is likely very limited. I would expect a list of a few dozen words likely to occur in a label (e.g. home, domestic, etc.) would catch maybe 7/10.
  • tel (telephone): this is a bit complicated. Having an address makes it much easier to tell if a given set of numbers is likely a phone number. Capturing anything that fits the patterns (###) ###-#### or ###-###-#### would get many phone numbers, and I suspect more is possible. 6/10.
  • email: This one is easy. An email address must fit a defined pattern, so we can discover all email addresses with no microformat, as evidenced by the proliferation of junk email. 10/10.
  • mailer: At any given time, there are only so many known email clients. 8/10.
  • tz (time zone): There are only so many timezones, and not too many ways to describe them. 9/10.
  • geo: Latitude and longitude information is pretty much useless if it doesn't follow a certain pattern (decimal numbers, between -90 and 90 for latitude and between -180 and 180 for longitude), but that doesn't mean all numbers that follow this pattern are geo codes. 6/10.
  • title: Theoretically unlimited, but practically limited. 7/10.
  • role: Words ending in "er" would catch a lot. Check for proximity to words like "job," "work," or "professional." 5/10.
  • logo: Just like photo, only probably smaller. 7/10.
  • agent: I had to look this one up. Auto-discovery doesn't look good. 0/10.
  • org: Just like names, only worse. 0/10.
  • categories: Could be anything. 0/10.
  • note: Again, anything. 0/10.
  • rev: Dates near words like "updated" or "modified." 7/10.
  • sort-string: Usually last word in the name. 6/10.
  • sound: Sounds have defined formats. 10/10.
  • uid: Pass. 0/10.
  • url: First standard link. 7/10.
  • class: Pass. 0/10.
  • key: Keys follow patterns. 10/10.
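For the pattern-friendly properties in the list above, the guessing could look something like the following sketch. The regular expressions are deliberately simple and are my own assumptions, not a tested scraper: the email pattern is a loose approximation, the telephone pattern covers only the two US shapes mentioned above, and the geo pattern only accepts a "latitude, longitude" pair of decimals within the valid ranges.

```python
import re

# Loose approximation of an email address; real-world addresses vary more.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Only the two shapes named above: (###) ###-#### and ###-###-####.
TEL = re.compile(r"\(\d{3}\) \d{3}-\d{4}|\b\d{3}-\d{3}-\d{4}\b")

# Two decimal numbers separated by a comma, read as "latitude, longitude".
GEO = re.compile(r"(?<![\d.])(-?\d{1,3}\.\d+)\s*,\s*(-?\d{1,3}\.\d+)")


def guess_fields(text):
    """Guess email, tel, and geo values in plain text with no markup to help."""
    guesses = {
        "email": EMAIL.findall(text),
        "tel": TEL.findall(text),
        "geo": [],
    }
    for lat, lon in GEO.findall(text):
        lat, lon = float(lat), float(lon)
        # Matching the number pattern is not enough; the ranges must fit too.
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            guesses["geo"].append((lat, lon))
    return guesses


if __name__ == "__main__":
    sample = (
        "Call Jane at (515) 555-0100 or 515-555-0199, "
        "or write jane@example.com. Office: 41.5868, -93.6250."
    )
    print(guess_fields(sample))
```

Even in this toy version the difference between the properties shows: the email guess is fairly safe, while any pair of decimals in range would be reported as a location whether or not it is one.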

Average: about 4.8/10. In general there are some areas in which microformats are entirely unnecessary, some in which they are entirely necessary, and some in between. Of course, these are mostly rough estimates on the potential accuracy of intelligent scraping. The actual accuracy would need to be determined by writing a scraper and pitting it against some actual data.

In any case, microformats appear well worth the expense to capture that 52% (or however much) of the existing information. Even though email addresses are entirely identifiable without any microformat, as long as we're wrapping names in name tags, it makes sense to wrap the email addresses at the same time so a parser doesn't need to be any smarter.
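The average and the leftover share are easy to recompute from the per-property scores. A minimal sketch, with the scores copied from the list above; the function names are just for illustration, and the scores themselves are still rough estimates.

```python
# Guessability without markup, 0 to 10, copied from the list above.
SCORES = {
    "fn": 0, "n": 0, "nickname": 0, "photo": 7, "bday": 9, "adr": 5,
    "label": 7, "tel": 6, "email": 10, "mailer": 8, "tz": 9, "geo": 6,
    "title": 7, "role": 5, "logo": 7, "agent": 0, "org": 0,
    "categories": 0, "note": 0, "rev": 7, "sort-string": 6, "sound": 10,
    "uid": 0, "url": 7, "class": 0, "key": 10,
}


def average_score(scores):
    """Mean guessability across all listed properties."""
    return sum(scores.values()) / len(scores)


def share_needing_markup(scores):
    """Fraction of the information a scraper would be expected to miss."""
    return 1 - average_score(scores) / 10


if __name__ == "__main__":
    print(f"average: {average_score(SCORES):.1f}/10")
    print(f"needs markup: {share_needing_markup(SCORES):.0%}")
```

Weighting every property equally is itself an assumption. A real comparison would weight by how often each property actually appears in published hCards.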

While not the absolute simplest method, microformats appear to be the lowest common denominator of structured documents. So now I think I was wrong when I wrote that we're headed towards a "semantic web" in which the semantics are forced onto websites by browsers and other intermediaries. I still expect that will happen (as I notice it happening, and cause it to happen), but given the practical limits of the smart-application method of connecting the world of information, it will only work as a bridge to a semantic web composed of metadata-rich documents.

 

In other microformat news, over the weekend I made a draft version of a "Microformats Zen Garden." The idea, introduced on the microformats-discuss email list, is an obvious knockoff of the CSS Zen Garden, only the (X)HTML is full of microformatted information, and JavaScript is added to the mix. I spent a few hours working on this, and when I was done, I realized the concept was not just an application, but almost a platform - a small hint at the mythical web-as-operating-system. Microformats act as the documents, CSS handles the visual style, and JavaScript acts as the applications. The only important thing missing is the ability to save edited documents, but Mark Pilgrim is already working on using Atom for that. I'll be very interested to see how this all materializes.