Microformat Base

The launch of Google Base inspired a bit of armchair quarterbacking about how Google might have done it differently. One suggestion, popular - of course - among the microformats community, was that Google could use microformats to remove the need for submission to their base and leverage the distributed nature of the web.

Personally, I suspect there's just not enough microformatted content out there yet to make it worth Google's cycles parsing it. Lucky for me, my own parsing cycles aren't so valuable. Microformat Base is my attempt at a microformat-based alternative to Google Base. It's slowly crawling the web looking for microformatted content, and adding it to a structured database, searchable by microformat class names. There are plenty of improvements to be made, but it's already functional in the most basic form. You can find several vcards for people named Tantek, for example.
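To make the idea concrete, here is a minimal sketch of the extract-and-search step. It is not Microformat Base's actual code. It uses only Python's standard library, skips the crawling itself, and assumes reasonably well-formed HTML. The names HCardParser, extract_hcards and search are hypothetical, and PROPERTIES is a small illustrative subset of the hCard class names.

```python
from html.parser import HTMLParser

# Tags that never get a closing tag, so they must not affect depth tracking.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
# Illustrative subset of text-valued hCard class names (an assumption).
PROPERTIES = {"fn", "org", "nickname"}


class HCardParser(HTMLParser):
    """Collects the text of known properties inside class="vcard" elements."""

    def __init__(self):
        super().__init__()
        self.cards = []       # finished cards: {class name: [values]}
        self._card = None     # card currently being read, if any
        self._depth = 0       # open-tag depth inside the current card
        self._prop = None     # property whose text is being captured
        self._prop_depth = 0  # depth at which that property opened
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._card is None:
            if "vcard" not in classes:
                return
            self._card, self._depth = {}, 1
        else:
            self._depth += 1
        if self._prop is None:
            for name in classes:
                if name in PROPERTIES:
                    self._prop, self._prop_depth = name, self._depth
                    self._text = []
                    break

    def handle_data(self, data):
        if self._prop is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._card is None:
            return
        if self._prop is not None and self._depth == self._prop_depth:
            value = " ".join("".join(self._text).split())
            self._card.setdefault(self._prop, []).append(value)
            self._prop = None
        self._depth -= 1
        if self._depth == 0:  # the vcard element itself just closed
            self.cards.append(self._card)
            self._card = None


def extract_hcards(html):
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.cards


def search(index, class_name, term):
    """index is a list of (url, card) pairs; match term within one class name."""
    term = term.lower()
    return [(url, card) for url, card in index
            if any(term in value.lower() for value in card.get(class_name, []))]
```

A crawler would call extract_hcards on each page it fetches and append (url, card) pairs to the index. A query such as search(index, "fn", "tantek") then returns every stored card whose fn value contains that name.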

If anyone's interested, it's open source and will eventually be open data in some form or another. I'm not looking to start a new public search engine — just demonstrate that someone with more time and experience than I and maybe an existing web crawler (*cough cough*) could do something like this. I suspect a decent search engine would inspire more microformatting, and may prove the best way to work around the chicken-egg adoption problem microformats currently face. Until someone else builds it better, I'll keep tweaking Microformat Base to that end.

I'm currently trying to use microformats to represent diabetes data. If you'd like to help with this, have a look at my Diabetes Data wiki at http://diabetesdata.pbwiki.com

