Yesterday, Technorati released Microformats Search. My first thought was something like "finally..." It's been six months since Google released Google Base. At that time many people were pointing out that submission-based search isn't going to work in the long term because it takes more work than crawl-based search, and all other things being equal, laziness wins. The advantage of Google Base over traditional search was that it used structured data, so the obvious solution was to make the web more structured.

Microformats make the web more structured, so I thought it would be interesting to see how much structured data a big search company like Google could hope to find by crawling instead of asking for submissions. I made Microformat Base, and before long, the microformats community, the broader semantic web community, and the entire world were ... completely ignoring it. No one used it. No one talked about it. No one copied the open source.

Well, that's not entirely true. A few people on the microformats discussion list said some nice things about it. But the conversation there quickly went back to what would later become hAtom. I went on with my life, playing with other technologies in my spare time. I hoped maybe in another year or two, someone with some venture capital would pick up the idea and make the web a more interesting place.

Then yesterday, as I said, Technorati released Microformats Search, and I thought "finally..." I think I have a pretty good track record for predicting where technologies are headed, and I continue to be annoyed by how long it takes the rest of the world to catch up to my imagined future. I didn't expect Technorati to act as quickly as it did, but I made Microformat Base in a day, so it really shouldn't have taken them six months.

I was happy Technorati had caught up, but then I started reading what Technorati was writing about it. The first thing I read was on the blog of Tantek Çelik. I subscribe to his blog because he talks about interesting technology, and I like to pay attention to where things are heading. Tantek wrote, "I invite you to come take a look at this first of a kind realtime microformats search engine." At that point, I thought "Hmm...that's an odd way of phrasing that. It almost sounds like he has no idea that I ever made Microformat Base..." Then Tantek sent an email to the microformats discussion list, writing, "There are some indexers of specific microformats right now (e.g. Reevoo and Kritx both index hReviews), but no general microformats search engine." At that point, I realized that Tantek really did have no idea I had made Microformat Base, which was surprising because I knew he had previously commented on it.

I wrote a response, saying, "Hmm... I'm pretty sure I was indexing contacts, events, and reviews several months ago...I'd assume you missed that, except that you commented on it." And Kevin Marks, who also works at Technorati, responded to that with "Great stuff Scott, do you want to get pings relayed?" At this point, I was trying to be charitable with my take on what was going on here, but it really looked to me like Technorati was intentionally ignoring what I had done, except where they realized that I could be feeding them data.

What I didn't expect was this feeling that microformats are increasingly just another product owned and sold by Technorati. I'm disappointed that Technorati has apparently developed selective amnesia here regarding others' work. Tantek says "Technorati believes in the voice of the individual," but here I am, an individual, and everyone from Technorati is pretending like I don't exist except where I could contribute more data toward Technorati's profit. I have no doubt that if I had done the same work at a corporation, I wouldn't be seeing phrases like "no general microformats search engine" and "first of a kind" coming out of Technorati. And I'm certainly not the only individual who has worked on this. Dozens of individuals helped lay the groundwork for Technorati's newest product, but not a single one is acknowledged in Technorati's discussion of the Microformats Search - only corporations. This will certainly make me think twice before experimenting further with microformats in my free time.

And I sent it off and thought "bridges: burninated!" I'm not working at one of Technorati's partners, so if microformats really are a product Technorati intends to claim or imply ownership of, I have little to lose by criticizing this trend. I was content to do so in the pseudo-privacy of the microformats email list, but then Tantek responded on his public blog.

He wrote a long post, starting with "Mea culpa," including my name nine times, linking to me five times, and going on about how great it is that web workers of the world are uniting under the microformats flag of data freedom or something like that. And I guess I'm happy that Tantek is now reaffirming my romantic notion of what the web could be. The only problem is, I don't really believe it any more.

As I pointed out yesterday, a microformat search engine isn't the first project I've done a proof-of-concept for that later became a successful part of the web. In fact, nearly everything I've ever done online has followed this pattern, going back ten years to when I edited the raw compiled source of a browser plugin (a simpler task back then) to allow users to add any search engine they wanted to a field in their browser and released it for free as "AnySearch Extras". It's now ten years later, and Firefox is the most popular browser to have this same functionality built in. So about 10% of the world has caught up to what I was waiting for ten years ago.

I'm tired of waiting for the web to pay attention. The web is awful at paying attention. One might think Technorati would be a little better at paying attention than others, given what it does and its ownership of a trademark on the phrase "attention index." But experience suggests Technorati is just like the rest of the web. Interesting technology doesn't get the web's attention. Open source and open data don't get the web's attention. I've been doing both for several years. The web hasn't noticed. As Christian Montoya recently observed, money is what gets the web's attention.

It's not just that Tantek originally gave thanks only to corporations with money and ignored all the individuals working on microformats. Tantek's a busy guy, so he forgets things. But the whole web today gave notice to Technorati, with money, and ignored an individual who did the same thing, without money. Tantek wrote:

Companies take note - on the internet, there will always be smarter, more clever people building on each other's work than your secret internal committees, your architecture councils, your internal discussion forums -- no matter how many supergeniuses you think you may have hired away and locked up with golden shackles in your labs.

This has long been a popular mythology on the web, but I no longer believe it. I am the prototypical clever person building on others' work and encouraging others to build on mine. Companies can safely ignore me.

If it helps any, my thought process with the Technorati announcement followed a line of "the semantic web is finally born!"..."thinking that is giving me deja vu"..."oh yeah, this is just a free-query Microformats Base."

I empathize with you, but fear you're attributing your obscurity incorrectly. There are thousands of "us" out there that were separating content and presentation before the CSS Zen Garden, using microrequests for live page updates before AJAX, assigning semantics to CSS class names before, etc. It's not that we don't have money that makes us obscure, it's that we don't already have attention.

Technorati already is in the zeitgeist, and them doing a microformat search means that many people who make decisions at GYM will be thinking about it. You doing the same thing is a technology demo that only other geeks think about.

Companies like Technorati serve a very important role in our ecosystem. Tantek is a geek who notices things like Microformat Base, but also makes decisions and can choose to expand upon the idea because Technorati is small enough. But they're large enough that by doing so he's promoted the idea into a larger market.

If you had a lot of money, you'd still have to work with the big corps to effect change (like Dave Winer). If you want your ideas to make a bigger splash, you need to embrace the market's attention economy and either work for Google and make them 20% projects with that all-important brand name on them, or spark the interest of folks like Tantek.

It's disappointing that he didn't originally credit you, and more so that he pretended this was a brand-new idea. But the mythology isn't completely false; you still made a difference. You just don't get to be the one to profit off it. :-(

Hans, I think you're right that it starts with attention, but I've had plenty of attention in the past. The problem I have is that it doesn't last for me like it does for Technorati or any other dot-com because I'm not monetizing the attention and re-investing that money in getting more attention.

I don't do that because I don't want the money. I want to maintain this fantasy that I can improve the web without the money. But I can't. I've been trying it for ten years, giving away open data, open source, open ideas, for free, and it hasn't worked. It's gotten me a lot of attention, but attention doesn't lead to improvement. Money leads to improvement. I want improvement, so I need to start thinking more about money.

Technorati does seem to suffer from a garden-variety case of NIH syndrome, in my experience. I had a similar interaction with Kevin Marks at a something-camp event last year: an attempt to converse about an existing free software implementation of certain microformats was met with the equivalent of a blank stare. "That's nice ... [back to e-mail]."

My own impression is that microformats are getting a lot of attention from vendors and aggregator-producers (who would benefit directly from an ecosystem of such data), but zero-to-none from writers and publishers. Technorati focuses on the former, because it's easier to convert a single Ray Ozzie and get the same kind of publicity bang that might result from hundreds of little guys like you. The problem is that only the latter kind of attention has actual legs.

Scott, I think you greatly underestimate the potential impact of your work. The challenge is that not every work will necessarily meet enough other folks' needs to "tip" and surge. Sometimes it is just a matter of time. Consider that as microformats are more widely recognized and used, more folks will look for open source to start with, and will inevitably find your work (either by searching for it, or by finding references to it on the microformats wiki). Sometimes you never know what will take off when. That's no fault of yours nor anybody else's, that's just the way chaotic markets of ideas work.

Hans, you make some good points. I'm not sure I agree with all your examples. CSS Zen Garden for example was developed by a web designer in Canada who was quite obscure at the time - of course plenty of us (myself included) had been separating presentation from markup for quite some time. The key thing that CSS Zen Garden brought to the picture was a very high level of aesthetic design which served a critical role in convincing people that it was possible to do good design in CSS.

Regarding "AJAX", how do you think the Google engineer(s) feel(s) who used AJAX techniques to build Google Maps etc. and have never been recognized for it? Such obscurity doesn't just happen to individuals producing open source -- sometimes it happens to very smart people making very good tools inside large companies as well.

There is also an aspect of persistence to this. That is, you might spend years creating things that you thought people should pay attention to, and be ignored, and then one day build a weekend hack for a friend, and have that be the thing that takes off. You could say I have some personal experience with that.

Regarding the failure to originally credit Scott, what more can I say? I'm human and made an unintentional omission. It certainly won't be the last mistake I make.

And you're certainly right that the idea isn't new. What I hope is understood as new is the specific implementation of a flat microformats search (as opposed to a fielded search) that searches all the fields of a microformat simultaneously, searches nearby text as well for enhanced relevance, and can search across multiple microformats at once rather than just one type at a time. Clearly doing fielded search as Microformat Base does is quite useful as well -- I'm just hoping people understand the difference. And as far as profiting from it, that too is yet to be figured out. My hope is that the community will figure out and pursue a variety of business plans that use microformats.
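
The flat-versus-fielded distinction can be sketched in a few lines of Python. This is only an illustration under assumed data: the record layout, the `nearby_text` key, and the function names `fielded_search` and `flat_search` are hypothetical, and nothing here reflects how Technorati or Microformat Base actually index or rank anything.

```python
# Hypothetical parsed records. Field names loosely follow hCard, hCalendar,
# and hReview properties; "nearby_text" stands in for text found around the
# microformat on the page (an assumption made for this sketch).
RECORDS = [
    {"type": "hcard",
     "fields": {"fn": "Ada Example", "org": "Example Cafe"},
     "nearby_text": "Contact us in Portland"},
    {"type": "hcalendar",
     "fields": {"summary": "Microformats meetup", "location": "Example Cafe"},
     "nearby_text": "Join us next week"},
    {"type": "hreview",
     "fields": {"item": "Example Cafe", "description": "Great espresso"},
     "nearby_text": "Posted from Portland"},
]


def fielded_search(records, mf_type, field, term):
    """Match a term against one named field of one microformat type."""
    term = term.lower()
    return [r for r in records
            if r["type"] == mf_type
            and term in r["fields"].get(field, "").lower()]


def flat_search(records, term):
    """Match a term against every field plus nearby text, across all types."""
    term = term.lower()
    hits = []
    for r in records:
        haystack = " ".join(r["fields"].values()) + " " + r["nearby_text"]
        if term in haystack.lower():
            hits.append(r)
    return hits
```

With this toy data, `fielded_search(RECORDS, "hcard", "org", "cafe")` returns only the contact, while `flat_search(RECORDS, "cafe")` returns the contact, the event, and the review in one query, and `flat_search(RECORDS, "portland")` finds records whose match comes only from the surrounding text.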

Michael, very sorry to hear that you think Technorati suffers from NIH, as I can definitely say we actively fight it internally. If I put myself in Kevin's shoes, I have a feeling that any kind of brush-off you may have felt was due to continuous partial attention rather than any actual dismissal of whatever you were showing. It is my hope that even if any one vendor (or many vendors) do(es) suffer from NIH, the fact that we have everything on an open wiki will reduce the chances for "unintentional NIH". I'm also very much open to suggestions for how we (Technorati) or we (the microformats community) can further discourage NIH. Whenever you see us or anyone else doing it, I encourage you to call it out, and we'll do our best to correct it.

As far as attention, ironically I think writers and publishers (including vendors) are paying more attention to microformats than most aggregators, since microformats are designed to be easier for publishers than aggregators, and the immediate benefits are targeted at the publisher (e.g. X2V conversions for their users etc.). Obviously as more microformatted content is published, that may change and we may see a shift in the interest of publishers as well.

Apologies to Scott for leaving such a lengthy comment, but the issues you guys raised deserved at least some discussion. I'm doing my best to learn from your feedback and will hopefully do a better job in the future.


Tantek, I don't mean to make you the focal point of my criticisms of the web in general, nor do I mind lengthy comments, nor do I think Technorati suffers from NIH (a new term to me, means "Not Invented Here" for anyone else not aware) significantly more than any other large organization.

My recent change is that I can't continue chalking up the web's collective attention to the "chaotic markets of ideas" you describe. I think the market of ideas is heavily skewed towards those who can afford and care to skew it. Bloggers are increasingly aware of their ability to skew the market of ideas. BoingBoing, for example, doesn't write negatively about their ideological opponents for their own entertainment (e.g. Google "underwear perverts"). They do it because they are very aware that they have the power to shift focus on the web. They know that because they're regularly at the top of popularity lists like Technorati's. I don't see much chaos in this market of ideas. It appears to be very predictable.

So here's my prediction: closed source commercial projects are more likely to succeed on the web. My most successful projects have been closed source and commercial. Despite conventional wisdom to the contrary, I've seen evidence suggesting open source and free is less popular. Maybe if I just stick to open source freebies long enough, the web will improve itself. Maybe if I dance just right, it will rain. But I doubt both.

I read the discussion with interest. If you'll excuse the digression, I have a comment on Scott's prediction based on my own experience as a consultant-architect. The software world is increasingly moving towards hosted applications where the source code is typically not installed by/delivered to the end user. Hosted applications have historically been using closed source software but are increasingly including open source software as well. This is partly because they don't have to redistribute the open source components and partly because the open source components are proven. And this trend is on the increase, not decrease.

