What does Google’s support of Microformats and RDFa mean?
Google’s recent search update included faceted search (which I wrote about last week) and something called “rich snippets.”
These snippets “[provide] users with a convenient summary of a search result at a glance,” and they work well for search listings such as reviews, ratings, event information and other small pieces of structured data.
A little more clarification -
The current Web is primarily made up of an enormous number of documents that have been created using HTML. These documents contain significant amounts of structured data, which is largely unavailable to tools and applications. When publishers can express this data more completely, and when tools can read it, a new world of user functionality becomes available, letting users transfer structured data between applications and web sites, and allowing browsing applications to improve the user experience: an event on a web page can be directly imported into a user’s desktop calendar; a license on a document can be detected so that users can be informed of their rights automatically; a photo’s creator, camera setting information, resolution, location and topic can be published as easily as the original photo itself, enabling structured search and sharing.
Google is able to provide this information in its search results by indexing websites that include microformats and RDFa in their page markup. Take for example this book listing from Amazon -
It’s a clean and rich presentation of the important data related to this book; you’ll find the title, the author, the publication date, a rating and a summary. For Google to make use of this data and present it in its search results, the HTML needs to label each of those fields with microformat class names or RDFa attributes.
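As a rough illustration (this is not Amazon’s actual markup), here is a small Python sketch of what a review might look like using the hReview microformat, and how a tool could read it back out. The book title, reviewer name and the `parse_hreview` helper are all made up for this example; only the class names (`hreview`, `item`, `fn`, `reviewer`, `dtreviewed`, `rating`, `summary`) come from the microformat itself.

```python
from html.parser import HTMLParser

# Hypothetical book review marked up with hReview class names.
SAMPLE = """
<div class="hreview">
  <span class="item"><span class="fn">An Example Book</span></span>
  by <span class="reviewer">A. Reader</span>,
  reviewed <abbr class="dtreviewed" title="2009-05-12">May 12, 2009</abbr>.
  Rating: <span class="rating">4.5</span> out of 5.
  <p class="summary">A short, readable introduction.</p>
</div>
"""

# The subset of hReview properties this sketch looks for.
FIELDS = {"fn", "reviewer", "dtreviewed", "rating", "summary"}


class HReviewParser(HTMLParser):
    """Collects the text of elements whose class matches a known field."""

    def __init__(self):
        super().__init__()
        self.review = {}
        self._stack = []  # (tag, field or None) for each open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        matches = sorted(classes & FIELDS)
        field = matches[0] if matches else None
        if field and tag == "abbr" and attrs.get("title"):
            # abbr pattern: the machine-readable value lives in "title".
            self.review[field] = attrs["title"]
            field = None
        self._stack.append((tag, field))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag (tolerates unclosed tags).
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        # Text belongs to the innermost open element that is a field.
        field = next((f for _, f in reversed(self._stack) if f), None)
        if field:
            self.review[field] = self.review.get(field, "") + data


def parse_hreview(html):
    """Return a dict of field name -> text for one hReview block."""
    parser = HTMLParser.__new__(HReviewParser)
    parser.__init__()
    parser.feed(html)
    parser.close()
    return {name: value.strip() for name, value in parser.review.items()}


if __name__ == "__main__":
    for name, value in parse_hreview(SAMPLE).items():
        print(f"{name}: {value}")
```

A real crawler does far more than this, but the idea is the same: the class names tell the machine which piece of text is the title, which is the rating, and so on.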
Similar formatting would be applied to everything from phone numbers, street addresses and postal codes to social networking information (such as who is your friend, co-worker or acquaintance), and it might possibly be used by Google to index the things you’re paying attention to (like browsing history, musical preferences, tweets and favorite blog posts).
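To make the social networking part concrete: the XFN microformat records relationships as `rel` values on ordinary links, while contact details follow the class-name pattern (hCard uses names such as `tel`, `street-address` and `postal-code`). Below is a minimal Python sketch that pulls XFN relationships out of a list of links. The URLs, names and the `parse_xfn` helper are invented for illustration, and nothing here is meant to describe how Google actually processes these pages.

```python
from html.parser import HTMLParser

# Relationship values defined by XFN.
XFN_VALUES = {
    "friend", "acquaintance", "contact", "met", "co-worker", "colleague",
    "co-resident", "neighbor", "child", "parent", "sibling", "spouse",
    "kin", "muse", "crush", "date", "sweetheart", "me",
}

# Hypothetical blogroll; the last link carries no relationship at all.
SAMPLE = """
<ul>
  <li><a href="http://example.com/alice" rel="friend met">Alice</a></li>
  <li><a href="http://example.com/bob" rel="co-worker">Bob</a></li>
  <li><a href="http://example.com/carol" rel="acquaintance nofollow">Carol</a></li>
  <li><a href="http://example.com/about">About</a></li>
</ul>
"""


class XFNParser(HTMLParser):
    """Maps each link target to the XFN relationships declared on it."""

    def __init__(self):
        super().__init__()
        self.relationships = {}

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        # Keep only rel tokens that are XFN values (drops e.g. "nofollow").
        rels = set((attrs.get("rel") or "").lower().split()) & XFN_VALUES
        if href and rels:
            self.relationships[href] = sorted(rels)


def parse_xfn(html):
    """Return a dict of URL -> sorted list of XFN relationship values."""
    parser = XFNParser()
    parser.feed(html)
    parser.close()
    return parser.relationships


if __name__ == "__main__":
    for url, rels in parse_xfn(SAMPLE).items():
        print(url, "->", ", ".join(rels))
```

Anything that can read a page can read these relationships, which is exactly why the privacy questions further down matter.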
Yahoo’s SearchMonkey, along with other search engines and web services, is already doing some of these things, but Google’s command of over 70% of the search market means that support for RDFa and microformats will soon be a must for large content providers.
And how does this affect Idealist?
We’ll need to start thinking about which data should be structured this way and which microformats we’ll want to use. Will we allow users to rate organizations, resources and so on? Will search engines have access to microformat information that might be deemed personal (even if they promise not to publish it), or should the user decide?
If you’ve ever heard the term “semantic web,” this is what it refers to. Exciting stuff, no?