What's That Noise?! [Ian Kallen's Weblog]


20060129 Sunday January 29, 2006

The Ironies of Blogger

Off and on, I've been using a blog on blogger as a link blog and for other "quickies" to post. Tonight, I was closing some browser tabs with stuff I'd been reading about BerkeleyDB (JE) and went to post them here with ecto.

Strangely, ecto was giving me errors, so I checked its console:

Response:
Your post has been saved as a draft instead of published. You must go to Blogger.com to publish your post. To prevent these errors in the future, request a review at:
http://www.blogger.com/unlock-blog.g?blogID=ABCDE
Sorry for the inconvenience

Bizarre! So I went to the "unlock-blog" URL and got this:

Inconvenience?! The links are to this page that outlines how feeble their anti-spam measures are. They don't have spam-prevention robots; who are they kidding? Will Robinson had a robot. Want to get me started on inconvenience?

The irony here is that it's relatively trivial to produce a list of recently created spam blogs from the blogspot service merely by watching the publish activities on any given day. I do it all the time. There's a massive amount of spam on that service that percolates up from some simple correlations. And Technorati's index is subsequently scrubbed often on that account. It's a little inconvenient. Yet the blogspot people continue to rely on the "Flag As Objectionable" button as their main weapon. Get a clue! Look how trivial that system is to game; I'll surmise that I got googlebowled (bloggerbowled would be more apt) by some dingleberry black hats. Very inconvenient. Given the rampant fraud going on within blogger, it doesn't take a rocket scientist to see.

Google is pretty much operating an open-relay, the blogosphere's equivalent of an SMTP spam-mill, because they lack the imagination to watch their own numbers and their spam rolls out unabated. This has been a ballooning problem for at least a year and a half. It's actually kinda inconvenient. Don't think they haven't been advised. Long before maverick-man coined the term "splog" I'd been sending my friends at the big G data on the extent of their problems. They know.

Dudes, go down the hall and ask the smart people at gmail how they do it. Now I have to go through the blogspot captcha gate instead of using ecto to post on their platform, hardly worth the bother.

Enough of the ironies and inconveniences. On the subject of ecto, please think kind thoughts for Adriaan. I am.

( Jan 29 2006, 11:20:57 PM PST ) Permalink


20060125 Wednesday January 25, 2006

HTML in the Real World

Google has a study of how HTML is really being used out in the wild. They've posted their results: Web Authoring Statistics.

In December 2005 we did an analysis of a sample of slightly over a billion documents, extracting information about popular class names, elements, attributes, and related metadata. The Good, The Bad and The Ugly: it's all in there.

A billion documents sampled, nice!

I think this is a demonstration of Google's expanding interest in grokking the semantics that are latent in document structures. The results are broken down by

That's pretty good coverage!

I'd love to see how the data changes over time. I suspect parts of the web are becoming more orderly (web 2.0 applications are likely using well-formed document structures) while the web as a whole is probably atrophying (the vast installed base of crappy or misconfigured tools is likely the preponderant generator of markup). I'm anticipating a lot of interesting data emerging as the ascendance of microformats continues. Goog, looking forward to follow-up surveys!

( Jan 25 2006, 06:37:29 AM PST ) Permalink


20060115 Sunday January 15, 2006

Surfacing Microformats in Firefox

Calvin Yu has published his Tails Firefox Extension that will surface the microformats on a page. Looks good! The plugin shows the hCals and hCards in a page, and he's got a nice screenshot of the hCard renderings. Adding contacts (like Smartzilla) and events would be nice but the plugin could just point at the gateways for contacts and events on Technorati. BTW, for Firefox you can one-click install Technorati search in the browser search box on the Technorati Tools page.

( Jan 15 2006, 04:29:28 PM PST ) Permalink


20060114 Saturday January 14, 2006

Distributed Conversations with Microformats

Last summer, Ryan King and Eran Globen blogged about citeVia and citeRel as a means of denoting conversation semantics between blog posts. A good summary and subsequent brainstorming is on the Microformats wiki. The blogosphere is currently rich with implicit distributed conversations. A little explicit microformat boost is, IMNSHO, exactly what's needed to put the nail in the coffin of a lot of the crufty old centralized group systems (like Yahoo!'s and Google's). The future of virtual community is here and it is in conversing with blog posts.

There's a lot of discussion of primary citations and secondary props ("Via") but there's not as much on reply-to semantics except for in a few of Eran's posts. Isn't reply-to central to a conversation? Citations are more bibliographic (like when you're linking for a definition, a quote or to identify a source). On the other hand, conversations are about exchanged replies. This is as old as the Internet. Email clients put In-Reply-To headers in messages when you reply to them. RFC 850 defined Reply-To and References headers for USENET messages over twenty years ago. Reply-to has been the binding for conversations for years, why stop now? That doesn't mean not to use cite and via; those are cool too but they're orthogonal to conversing and more pertinent to definition, quotation and source identification. I'm not entirely sure how I'd like to use via since it's kinda like citing a cite -- maybe it's not necessary at all. If you think of a via as a degenerate quote, then use quote. For instance, I think this makes sense (but then, I had a few glasses of wine earlier... I might not feel the same way in the morning):

I might agree that <a href="http://theryanking.com/blog/archives/2006/01/08/blogging-while-adult/" rel="reply-to">negative sarcasm</a>
happens (and worse) wherever there is <a href="http://en.wikipedia.org/wiki/Anonymity" rel="cite">anonymity</a>; it is an inductively provable
aspect of human nature. Countless discussion boards have failed (and continue to) due to participant anonymity. However, it's also important to weigh
in with the benefits of anonymity: would citizens of censored and oppressed societies be able to engage in progressive debate without it?
Take a look at the Global Voices' <blockquote cite="http://joi.ito.com/archives/2005/05/23/second_draft_of_anonymous_blogging_guide.html">
<a href="http://cyber.law.harvard.edu/globalvoices/?p=179" rel="cite">Anonymous Blogging Guide</a></blockquote>.
Wine bottle is corked now. Does that make sense?
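
For anyone wondering how a crawler might read these semantics back out of a post, here is a minimal sketch in Python. It assumes the rel values used in the markup above ("reply-to" and "cite"), which are a proposal in this post and not a settled microformat; the class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collect (rel, href) pairs from <a> tags and cite URLs from <blockquote> tags."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (rel_value, url) in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("rel") and attrs.get("href"):
            # rel may hold several space-separated values
            for rel in attrs["rel"].split():
                self.links.append((rel.lower(), attrs["href"]))
        elif tag == "blockquote" and attrs.get("cite"):
            # a blockquote's cite attribute is treated as a citation too
            self.links.append(("cite", attrs["cite"]))


def conversation_links(html):
    """Group the URLs found in a post body by their rel value."""
    parser = RelLinkParser()
    parser.feed(html)
    parser.close()
    grouped = {}
    for rel, url in parser.links:
        grouped.setdefault(rel, []).append(url)
    return grouped
```

Feeding a post body to conversation_links() gives a dict keyed by rel value, so the "reply-to" entries are the conversation threads and the "cite" entries are the bibliography.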

( Jan 14 2006, 01:44:44 AM PST ) Permalink


Better, Faster Technorati Blog Embed

Willie Dixon was built for comfort, Technorati embeds were built for speed!

Here's an inside tip: if you are a Technorati member and you claimed your blog a while ago, you can likely optimize how your Technorati embed is served and thus speed up how fast your page renders. Go to your account page for your first claimed blog (or go through them all one by one and click Configure Blog). Does the blog embed code match what's in your template? Load your blog page and View Source to compare. The old school embed code looked like this:

<script type="text/javascript" src="http://technorati.com/embed/[BLOG-CLAIM-ID].js"> </script>
What you'll find on your account page is this:
<script type="text/javascript" src="http://embed.technorati.com/embed/[BLOG-CLAIM-ID].js"> </script>
How is this an optimization? Why should you bother updating your blog template from the old to the new style? It's faster! We optimized serving the blog embeds with some additional infrastructure not too long ago. The old way works (built for comfort) but the new one works better (built for speed)!
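
If you keep your template in a file and would rather script the change than hand-edit it, something like the following Python sketch would do the swap. It only assumes the two URL shapes shown above; the function name is made up, and the claim ID in the usage is a placeholder.

```python
import re

# Matches the old-style embed URL; the claim ID (up to the .js suffix) is captured.
OLD_EMBED = re.compile(r'http://technorati\.com/embed/([^"\'\s]+\.js)')


def upgrade_embed(template_html):
    """Point old-style embed script URLs at the embed.technorati.com host."""
    return OLD_EMBED.sub(r'http://embed.technorati.com/embed/\1', template_html)
```

Running it twice is harmless: the pattern only matches the old host, so an already-updated template comes back unchanged.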

( Jan 14 2006, 12:23:30 AM PST ) Permalink


20060113 Friday January 13, 2006

Technorati Is Hiring

Technorati is hiring engineers for the website. You should be expert with PHP (including OO constructs, PEAR libraries, templating and application frameworks -- what works and what doesn't), savvy with XHTML and CSS -- be ready with referenceable URLs to demonstrate, experienced with web 2.0 services (i.e. even if you don't blog you podcast or addictively use technorati, flickr, del.icio.us, digg, reddit, rollyo, squidoo, etc) as well as having programmed in at least one language other than PHP and Javascript. Lotsa bonus points for using microformats and Ruby on Rails!

This position is full time, requires US work eligibility and is on-site (San Francisco, 3rd and Brannan). So, is it you? Check out the job listing and send your resume!

( Jan 13 2006, 01:35:28 PM PST ) Permalink


20060111 Wednesday January 11, 2006

Google Earth on Mac OS X

I'd played around on a Windows box with Google Earth a bit last summer and was both enamored with the technology and saddened by the absence of Mac OS X support. Well, happy days are here again: a Mac version is out now!

The satellite definitely took new pictures of my neck of the woods; last time I'd checked you could see our car in the driveway of our house. Now there's a long shadow over it like they shot the picture very early in the morning. You can't see the car but you can see the garbage cans (well, the resolution isn't that good, they look like little blips).

Thanks, G!

( Jan 11 2006, 08:48:32 PM PST ) Permalink


20060109 Monday January 09, 2006

Claim Your Blog and Put Technorati Pinging On Your Browser Bookmark Bar

A lot of people blog on platforms that don't ping for them. They could just use ecto, it'll help with the post formatting, tagging, media integration as well as pinging. One of the features for Technorati members is that the ping page will render a link to initiate a ping for each of the blogs you've claimed.

If your blog platform won't ping on your behalf, drag those links up to your bookmark bar and click them whenever you publish a new post. The world is changing all around us. When you post, you're part of that change. When you use Technorati, you can watch it change. Welcome to the Real Time Web!
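
For the script-inclined, a ping can also be built by hand. The sketch below assumes the weblogUpdates.ping XML-RPC convention (blog name, blog URL) that ping services commonly accepted; this post doesn't spell out the protocol or the endpoint, so the function name is made up and the endpoint is left for you to supply. It only builds the request body and sends nothing.

```python
import xmlrpc.client


def build_ping_request(blog_name, blog_url):
    """Return the XML-RPC request body for a weblogUpdates.ping call.

    Assumption: the ping service follows the weblogUpdates.ping convention,
    taking the blog's name and URL as its two parameters.
    """
    return xmlrpc.client.dumps((blog_name, blog_url),
                               methodname="weblogUpdates.ping")
```

To actually send it, you would POST the returned XML to your ping service's XML-RPC endpoint, or use xmlrpc.client.ServerProxy against that endpoint and call weblogUpdates.ping directly.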

( Jan 09 2006, 10:40:47 PM PST ) Permalink


More Kudos

We must be doing something right. More kudos, this time from Jason Calacanis.

( Jan 09 2006, 12:46:48 AM PST ) Permalink


20060108 Sunday January 08, 2006

Kudos For Technorati's Anti-Spam Effort

Props from Jeremy on our anti-blog spam efforts are certainly appreciated. I know we don't have a spam-free index, however the amount of spam we keep out of the index is truly astonishing. Our ping interface is deluged with a torrent of rubbish but we do our best to scrub the nasty stuff out of our update stream. The problem defies conventional mail spam or even blog comment spam analytic techniques as the structure of blog spam is very different. Deep examination of the content and structure across a pattern of web sites is often required to distinguish it as spam but in the end, the indicators are there. Most spammers' publishing behaviors are statistical outliers by nature; the numbers speak for themselves.
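
To give a flavor of what "statistical outlier" can mean here, below is a toy Python sketch. It is not what Technorati actually runs: it assumes a single made-up signal (posts per day per blog) and flags sources whose rate sits far above the median using a median-absolute-deviation score. The function name, the 3.5 threshold and the sample numbers are all illustrative.

```python
from statistics import median


def flag_outliers(posts_per_day, threshold=3.5):
    """Return the names whose posting rate is far above the typical rate.

    posts_per_day maps a blog name to its posts per day (hypothetical input).
    The score is a robust z-score: 0.6745 * (rate - median) / MAD.
    """
    rates = list(posts_per_day.values())
    if not rates:
        return []
    med = median(rates)
    mad = median(abs(r - med) for r in rates)
    if mad == 0:
        # At least half the rates equal the median; too little spread to score.
        return []
    return sorted(
        name
        for name, rate in posts_per_day.items()
        if 0.6745 * (rate - med) / mad > threshold
    )
```

With a handful of blogs posting one to three times a day and one posting 400 times, only the 400-a-day source gets flagged. Real spam needs more than one signal, as noted above, but the shape of the idea is the same.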

We have a lot to do, on this and on many fronts but we try to pay attention to the gripes as a measure of priorities. The kudos are nice, too!

( Jan 08 2006, 08:29:31 PM PST ) Permalink


20060107 Saturday January 07, 2006

MSN bows to China

As I expected after first reading about Microsoft's policies when they were reported last summer, MSN has (as reported by msnbc.com) censored a Chinese blog at Beijing's request.

IMO, it behooves the Chinese-speaking blogosphere outside of China to vigorously discuss this. Beijing will have to adapt or retreat into isolation; they (and the world) can't afford the latter.

( Jan 07 2006, 08:49:20 AM PST ) Permalink


20060101 Sunday January 01, 2006

No Vacancy

This blog had a nice long vacation but it is now occupied, again. No, I wasn't in Borneo. I wasn't kidnapped by aliens (you never can be sure though, can you?). Nor was I in the hospital. I just found myself wanting to fix my platform but always too busy to do it. So I just didn't blog at all (except for on my super secret alter-ego blogs). While my efforts at going from 0.98 to 2.0.x of Roller never seemed to work out, I did get it to a 1.1 release (hey, take a little progress if you can't get it all). Most of all, I ditched my old template and stylesheet, they were pretty long in the tooth... (I think) this seems a lot cleaner.

A lot has happened with Technorati, the blogosphere, my deep dives into various technologies and other stuff. And there's more to come. And it's a new year. And speaking of which, it's that time again.

So here are my resolutions:

I'm going to get off my butt and get my cardiovascular system working. I'm going to overcome this rotator-cuff injury I've been hoping would just get better by itself (but never has). Ten years ago, I was physically fit easily, never got fat, injuries just healed themselves and I had no lack of physical agility and stamina. It didn't seem to matter that I didn't really try to take care of myself. Well, what a difference a decade makes and it matters now.
No, I don't need a new calling plan. I need to maintain my personal relationships a bit better. Between work and being with the kiddos and my better half, most of my other relationships have suffered.
I'm going to hit it out of the park with Technorati and live happily ever after. Or something like that. Last year, much of the effort at Technorati was focused on scaling models that can keep up with the blogosphere. Maybe we're not out of the woods now but we're in much better shape now than we were a year ago (or even the duration of my blogging lapse). In 2006, it's showtime. See that fence about 339 feet away from the plate? Watch the ball go over the fence.

OK, so maybe it's all very self-centered. Yeah, sure, somewhere along the way I'll be working to make the world a better place, too. But first things first.

Happy 2006!

( Jan 01 2006, 10:33:29 PM PST ) Permalink