The first version of about:mobile is up on devnews. Enjoy!
(Did you sign up for the email version yet?)

I thought it might be a good idea to make a little post about the first few days of whoisi just to give people a sense of how things are going on the backend here.
First of all, holy crap. Frankly, I had no idea that people would be as interested as they are. I knew I had a kind of quirky and weird idea and that it would be compared to a lot of other services that are out there but I had no idea the kind of response that I would get. It’s still a very very small site by any standards that I’m used to (since I work on such a huge project my sense of scale is way off) but it’s been pretty awesome to see the reactions and the pickup in the techie-web-2.0-bubble-conversation-cloud and watching on summize has been fun. It’s clear that a lot of people are in it for the gold rush without really understanding it but there are a lot of people who are learning how to use the site and, bugs and performance aside, are having a really good experience with it.
I wrote down a couple of numbers just as I was publishing my announcement blog post. When I posted, there were about 473 people on the site, the vast majority of whom I added myself. As of this writing there are about 2253 people now listed on the site. I suspect that’s mostly people adding themselves, but not entirely. There are clearly people using the site to follow large numbers of people and adding their friends. Not a huge number of people, mind you, but quite a few. I should have some numbers to share at some point on that. It will be a laughably small number if you’re used to google or friendfeed or facebook but for a guy who wrote this on weekends and nights I’m pretty happy with it.
Want to have some fun? Just sit there on the site and click on the Random Person link. When I used to do this all I would see was my friends. Now I get all this great random content. So frigging cool. It’s my favorite whoisi feature at the moment. I recommend investing half an hour and just enjoying whatever people are doing.
Some notable quotes from tech folks:
So what has changed in these last few days?
Since it’s a weekend and night project for me I did a lot of stuff this weekend, but I wanted to call out a few people who really helped me out:
This isn’t the oscars. What changed?
I actually got a lot of stuff done this weekend, but not a huge amount of it is user-facing (other than the fact that it should actually be able to keep up for at least a while.)
When adding new people you get really weird results of other people who supposedly own that url. This problem was twofold. First, whoisi tries to figure out if there’s a duplicate person on the site already based on the urls that it sees. A person-id is a scarce resource and once you allocate that number it’s hard to change to something else so getting that right up front is important. Given that, one of the things that it does is look at the urls you pass in and look for matches. It also took the preview feed that it downloads and looked at the links in that to see if there were duplicate matches vs. other people. This actually worked really really well until people started adding del.icio.us bookmarks and google reader feeds to the site which include links to other people’s sites, making it look like ownership. So needless to say, I ripped that out at about midnight tonight. Clearly, I didn’t think that one through. Once again, thanks for the excellent reports that helped track this down.
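For illustration, here is a minimal Python sketch of the part of the duplicate check that survived: comparing only the urls someone submits against urls already attached to existing people. This is not whoisi's actual code. The function names, the `known_sites` mapping and the normalization rules (lowercase host, no trailing slash, query string ignored) are all assumptions made for the example.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a url to a comparable form: lowercase host plus path, no trailing slash."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    return host + parts.path.rstrip("/")


def find_possible_duplicates(submitted_urls, known_sites):
    """Return the ids of existing people who already list one of the submitted urls.

    known_sites is a hypothetical mapping of person id -> urls on that entry.
    """
    wanted = {normalize(u) for u in submitted_urls}
    return sorted(
        person_id
        for person_id, urls in known_sites.items()
        if wanted & {normalize(u) for u in urls}
    )


# Made-up data for the example.
known = {
    1: ["http://example.com/blog/", "http://flickr.com/photos/alice"],
    2: ["http://example.org/"],
}
matches = find_possible_duplicates(["http://EXAMPLE.com/blog"], known)
```

Note what the sketch leaves out on purpose: it never looks at links inside a feed's content, since that is the check that made bookmark feeds look like ownership.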
Backend performance enhancements. Right now I’m basically completely bound to CPU performance based on how fast I can poll sites. The algorithm, as I described in a previous post, is pretty braindead and a lot of it still is. But at least it’s faster braindead. That’s what I spent most of the weekend on: buying myself some headroom for the future. We’ll see how long that lasts.
Fixed the XSS problem that was found. As I mentioned above, I spent a lot of time auditing code and I think that I found the problem areas. If someone else finds something, let me know.
Fixing broken HTML that was on the site. There were some missing closing div tags on some of the objects on the site. Frankly I’m amazed that things rendered at all, but those are fixed now. Also added a background color for the page for those crazy people who change that setting.
Added an error message so that if you go to the follow page and the people you’re following don’t have anything posted it lets you know what’s going on. This is that 500 that a lot of people were seeing. Lots of people added one person and went to the follow page. Bam, 500. Whoisi doesn’t show you stuff that is in a person’s timeline when they are first added. (This would cause an explosion on the everyone page, for example.) So you have to wait for them to start adding things before things start showing up on the timeline. So I added a page that explains that.
If you’re into mobile stuff and you want to know what’s going on with the Mozilla Project on the mobile front, take the time to subscribe to our mobile newsletter called ‘about:mobile’. (or follow it on The Mozilla Developer News Weblog. But, really, you should get on the mailing list because it’s cooler that way.)
We intend for it to be monthly, but might post more often than that if we have interesting things to report. So sign up! Only takes a couple of clicks.
Update: I just added twitter.com to /etc/hosts and pointed it at a site that doesn’t have a webserver. Works for now until twitter comes back.
Having some lunch and I thought it might be worth a small post while my burrito cools.
I just had to disable polling on whoisi because twitter is down. Again. Whoisi’s polling system, in case you were wondering, is basically as dumb as a wooden post right now. I’m not trying to pretend that I thought it would work forever, nor that it was very good. But it works well as long as the internet is pretty healthy and the number of failures is evenly spread out among sites on the web.
Here’s how the poll system works right now for each site in the database: refresh every site every n minutes where n is a random number between 1 and 30. That’s it. And it does that for everything. No backoff, no per-site limits, etc. It’s easy to plug that kind of thing into the code, but it’s yet another thing on the “not yet done list.” Designed to be smart, but without the brains behind it.
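A minimal Python sketch of a schedule like that, for illustration only. The function names and the in-memory `schedule` dictionary are hypothetical, and it assumes n is drawn again after every refresh, which the description above doesn't spell out.

```python
import random

MIN_MINUTES = 1
MAX_MINUTES = 30


def next_refresh(now_minutes, rng=random):
    """Pick the next refresh time: now plus a random delay of 1 to 30 minutes."""
    return now_minutes + rng.randint(MIN_MINUTES, MAX_MINUTES)


def tick(schedule, now_minutes, rng=random):
    """Return the sites that are due and reschedule each of them.

    There is deliberately no backoff and no per-site limit here: a site that
    keeps failing gets polled just as often as one that works.
    """
    due = [site for site, when in schedule.items() if when <= now_minutes]
    for site in due:
        schedule[site] = next_refresh(now_minutes, rng)
    return due


# Made-up schedule: site url -> minute at which it should next be refreshed.
rng = random.Random(0)
schedule = {"http://example.com/feed": 0, "http://example.org/atom": 45}
due_now = tick(schedule, now_minutes=10, rng=rng)
```

Plugging in backoff would mean having `next_refresh` look at a site's recent failures before choosing the delay, which is the part that is still on the not-yet-done list.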
You also have to understand how jobs are run. Jobs come from two sources, the “master service” (which I’ll describe in a later post) and the web site. But they all run through the same job queue. So when you try and add a new person it tries to go out and make a little preview of the site. That job has to compete with site refreshes that are also underway. The limit on the number of jobs that can be run at the same time is also dumb. Right now it’s 50 at once. Not 3/sec or 50 waiting for I/O, just 50 in progress.
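To show what a flat "50 in progress" cap looks like, here is a small Python sketch using a thread pool. It is an assumption-laden illustration, not the site's job runner: `run_jobs`, `InProgress` and `slow_job` are hypothetical names, and the sleep stands in for a fetch that is waiting on a remote site.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_IN_PROGRESS = 50  # a flat cap: not a rate, not "50 waiting on I/O"


class InProgress:
    """Counts how many jobs are running at once and remembers the peak."""

    def __init__(self):
        self.lock = threading.Lock()
        self.current = 0
        self.peak = 0

    def __enter__(self):
        with self.lock:
            self.current += 1
            self.peak = max(self.peak, self.current)

    def __exit__(self, *exc):
        with self.lock:
            self.current -= 1


def run_jobs(jobs, limit=MAX_IN_PROGRESS):
    """Run every job through one shared pool; return the results and the peak concurrency."""
    counter = InProgress()

    def wrapped(job):
        with counter:
            return job()

    with ThreadPoolExecutor(max_workers=limit) as pool:
        results = list(pool.map(wrapped, jobs))
    return results, counter.peak


def slow_job():
    time.sleep(0.01)  # stands in for a request that is slow to answer
    return "done"


results, peak = run_jobs([slow_job] * 120)
```

Because previews and refreshes share the one pool, a preview submitted behind a pile of slow refreshes simply waits for a free slot, no matter how quick it would be on its own.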
So when you have a few hundred twitter accounts you’re polling and they fail by having to time out, the queue gets backed up. Given how many people are adding accounts right now I thought it would be good if the site interacted well instead of having things refresh instantly. It’s a tough choice but it’s how it is until twitter recovers from whatever its latest pain is.
I wish that twitter would fail by giving an immediate 500 or even a connection refused. The slow death of waiting for a response is basically the worst possible thing that can happen. Fail faster. Please.
Not that I should throw stones for even a second, given how dumb my code is. But it’s a lesson in what happens when a (dare I say important?) service dies.
I’d like to announce the general availability of a project that I’ve been working on in my spare time for the last few months: whoisi.com.
As a disclaimer please note that this project is not related to the work that I do at Mozilla nor is it endorsed by my employer in any way. It’s a completely independent project.
If I had to describe it in one sentence I would describe it as a site that lets you easily keep track of what your friends are doing on the Internet via RSS feeds – but with some twists.
First and foremost the site is organized around people. Everyone has an entry that gives an overview of what they are doing on the internet – weblog posts, flickr photos, etc. But that entry is created entirely like a wiki. Anyone can edit anything. If you notice that your friend’s entry is missing a twitter account or a flickr account you can add them. No asking them to sign up for yet another fracking account somewhere. It’s entirely up to the follower to manage their friend’s entries.
This flips the usual model of the social networking site on its head. It’s up to followers to keep track of their friends and the accounts they happen to have. I tend to say that most social networking sites these days are organized around accounts instead of people, and tend to have all of the pain that requiring accounts creates – you have to convince your friends to participate and set up accounts, remember another password and then they have to come to the site often. It’s a system that creates winners and losers because there’s only so much attention available in the world. Whoisi tries to avoid that whole problem by using a different model. In this sense, whoisi is very different. And is really an experiment. I have no idea if people will like this model or not or how they will react to it.
I currently use the site to keep track of 309 people, which adds up to 728 feeds in total. That includes flickr accounts, twitter accounts and web feeds. Imagine what that would look like in a feed reader. Totally insane, right? But not with this site. I can keep up with what’s going on without it being a completely overwhelming experience. And not one of those people was required to create an account for me to keep up with what they are doing.
So let’s take a quick walk through the site and its main features.
I think that one of the most powerful features of the site is the fact that anyone can edit anything. Anyone can add another site under a name, anyone can remove a site that someone else has added that’s wrong and anyone can add aliases and group information to any entry.
I got sick of discovering friends on various sites by accident and having to add them. The knowledge of where people have accounts is stuck in too many people’s heads and this is just a tool to move that information into the world. For those of us who lead pretty public lives and interact with a lot of other people who do the same and post a lot of different places, this can be very handy. In this sense it’s a single source for your and your friends’ online personal profiles, and there’s no need to have those people create an account.
The flip side of this is that the site is open to defacement or people adding entries that you might not have expected. Wikipedia has the same problem. On the defacement front, I’m collecting history data so that I can provide history for a particular person and it should make it easy to undo damage created by would-be vandals. I just haven’t added the History link to a person yet.
On the privacy front things are a little bit more tricky. I’ve had some pretty strong reactions to this site by people who were very surprised to find their information collected in one place. I don’t actually think that whoisi is any worse or better than Google in this sense, except that it does make some things a little more convenient. Search that’s centered around people and their activities is one thing that whoisi enables. I’m not entirely sure how to balance that against privacy concerns. But I’m open to ideas on this front and how to manage privacy better. Right now it largely punts on the issue and assumes that public information is public information.
It’s important to realize that one of my design goals with whoisi was to deal entirely with public data. There’s no way to add password-protected feeds or hidden feeds. They have to be as public as any search would reveal. It’s just gathered around people instead of activities. A different noun than what Google searches on.
One of my design goals was to let people using the site stay as anonymous as possible. To this end, there’s no requirement that you make an account before you start following people.
If you search for someone and you want to follow them just click on the little “Follow Person” next to their name. That’s it. The information is referenced with a cookie and no other sign-up is required. No need to sign up for an account you need to delete later. If you want to remove your association with the site, just remove cookies for whoisi.com from your browser.
If you need a link to log in later to the site from another computer, there’s a link on the right hand navigation that will provide the links needed to do so. I suggest that you mail them to yourself. (People already have email accounts – why should I require that they sign up for another one?)
As per the previous entry about following people, there’s no account required. Much like Wikipedia, it’s possible to edit any entry. To this end, it’s driven using CAPTCHAs and that’s it.
Also like Wikipedia, I collect and log information about changes and the originator, including source IP information and will expose that information in history logs. It’s just a feature that I haven’t added yet. But the data is being collected.
To allow people to take the role of curators, I will probably allow people and editors to self-identify if they choose in the future. But that’s down the road. For now it’s about the barrier to entry and being a public resource. And not requiring accounts feels like a big part of that.
I have a strong distaste for classic RSS readers. They are cluttered and have an odd workflow, largely based around the original 3-pane design of mail readers. People aren’t Inboxes. People that don’t post very often take up as much room in the interface as those who do post often. That combined with the mail-like read/unread status often drives people like me to RSS bankruptcy. That’s why the following interface on whoisi is time-based. Originally seen as part of Mugshot and now more recently with FriendFeed, time-based interfaces are all the rage. And for good reason – they work well.
As I said earlier I follow more than 300 people on the site and I usually look at it a few times a day. I can scan through what’s going on very quickly, ignore stuff that doesn’t look interesting and then move on. The amount of time that I have to spend scanning through what other people are doing on the web (a big part of my day job, as a matter of fact) is vastly reduced. I’m much more effective with whoisi than without.
One of the things that annoys me about just about every social networking site that’s out there is that there’s always a gold rush to get nicknames and urls. Whoisi takes a different approach. It lets you define multiple names for someone.
These can be other versions of the same name like “Chris Blizzard” vs. “Christopher Blizzard” above but it can also be short nicknames used on various networks. Above I have “mozilla:blizzard” defined. This is because I use the nickname “blizzard” on Mozilla’s IRC server. The fact that it’s just a pair of text strings separated by a colon means it’s really flexible and extensible to just about any group you want.
Whoisi’s search is also aware of this when you search using the search box. For example, if I search for “blizzard” on the site, the search knows to look on the right hand side of the colon to pick out a username or nickname. You can also search for “mozilla:blizzard” or any other network:nickname combination that you want. A very useful feature that I haven’t seen other places and something I’ve always wanted.
This also leads to an interesting side effect. We now have ad-hoc groups. Want to find everyone who is associated with Mozilla? Search for “mozilla:” without anything on the right hand side of the colon. Or “gnome:”. You will get everyone who has a mozilla: as part of their alias. This is a pretty underdeveloped part of the site and something that has a lot of potential.
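The matching rules described above can be sketched in a few lines of Python. This is a guess at the behaviour, not the site's search code: `alias_matches`, `search` and the sample data are hypothetical, and treating a bare query as a substring match against plain names is an assumption.

```python
def alias_matches(alias, query):
    """Decide whether one alias satisfies a search query (case-insensitive)."""
    alias, query = alias.lower(), query.lower()
    network, sep, nickname = alias.partition(":")
    if ":" in query:
        if query.endswith(":"):
            # "mozilla:" matches every alias in that network: an ad-hoc group.
            return sep == ":" and network + ":" == query
        # "mozilla:blizzard" has to match the whole network:nickname pair.
        return alias == query
    if sep:
        # Bare query against a network alias: look right of the colon.
        return nickname == query
    # Bare query against a plain name (assumed to be a substring match).
    return query in alias


def search(people, query):
    """Return the ids of people with at least one matching name or alias."""
    return sorted(
        person_id
        for person_id, names in people.items()
        if any(alias_matches(name, query) for name in names)
    )


# Made-up entries: person id -> every name and alias on that entry.
people = {
    1: ["Chris Blizzard", "Christopher Blizzard", "mozilla:blizzard", "@fisl2008"],
    2: ["Example Person", "gnome:example", "mozilla:someone"],
}
group = search(people, "mozilla:")
```

With that data, a search for “mozilla:” finds both entries, “gnome:” finds only the second, and “blizzard” finds only the first.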
You also might notice that I have an entry above that just says “@fisl2008”. This is me just playing with events. One thing I’ve always wanted is the ability to say “I’m going to be at this event and I would love to see others who are doing the same.” In this sense it’s like saying “I was @ FISL 2008.”
I would love to have a system that lets me see who is going to show up at FISL 2008 (like my example above) or GUADEC 2008 (I just added an @guadec2008 alias to my name, actually.) From there it would be great if I had a way to keep track and see what was going on – where people are meeting, what they are thinking, and I had a way to expose that to conference organizers or participants. Once again, it’s all ad-hoc and self-organizing and a very early thought. But it’s neat if you can do this with very simple systems. Another thing to look into down the road.
Lots of people use tiny urls to share information on twitter and other networks. It’s convenient to have a small URL to paste. So if you’re viewing something through the follow interface everything has a tinyurl you can reference. For text links it’s the little link that says something like “26 minutes ago” and for images it’s the link for the image itself. (I couldn’t find a clean way to do tiny urls for images that didn’t involve an ugly link for every image so every image is a tiny url.)
One of the big complaints that people have about tiny urls is that they don’t convey much information. And too many innocent people are being Rick Rolled. There’s an API that lets you look up a tiny url and get information of the target. (Here’s a sample script that calls the API.) I imagine that people could write extensions that could look up the target if you saw a whoisi tiny link on a page. All tiny links start with http://whoisi.com/l/HEXID – very easy to identify.
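Since every tiny link shares that prefix, spotting them in a chunk of text is a one-line pattern. The Python sketch below only finds the links. It doesn't call the lookup API, whose details aren't given here. It assumes HEXID means a run of hexadecimal digits, and the ids in the sample text are made up.

```python
import re

# Assumes HEXID is one or more hexadecimal digits.
TINY_LINK = re.compile(r"http://whoisi\.com/l/([0-9a-fA-F]+)\b")


def find_tiny_links(text):
    """Return the hex ids of every whoisi tiny link found in the text."""
    return TINY_LINK.findall(text)


sample = "see http://whoisi.com/l/1a2b and also http://example.com/l/ff"
ids = find_tiny_links(sample)
```

An extension could run something like this over a page and then ask the API about each id it finds before you click.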
Minus the JS that the site uses, which should be cached after the first load, pages on the site are very small. The home page is 1.5k. The follow page is 10k. Pages for individual people range from 6-8k in size. I wanted small fast page loads which is one of the reasons why the site is so incredibly simple and text-based.
It also means that, at least for viewing, the site works very well for mobile phones. (Editing right now is pretty AJAX-heavy and I don’t have cgi-based fallbacks in place yet so editing from anything other than an iPhone is dicey at best.) I’m hoping to have more interfaces for mobile in place, including an iPhone interface at some point. I use it from my T-Mobile sidekick all the time and people have told me that it works very well on their iPhones.
Another reason that the site looks the way that it does is that I’m a huge fan of simplicity and good use of negative space. The data should stand on its own and boxing and colors should not detract from what the real focus of the site is – what people are doing.
I’m a pretty big believer in Open Data. To that end one of my rules for this site is going to be that if there’s information that anyone explicitly adds to the site, they should be able to extract it and use it for any purpose they want. So one of the APIs that I have in place lets you extract the entire person and site database. (Here’s a sample script that does just that.) After all, I have no idea how long this experiment will last, and neither do you.
The APIs aren’t documented yet, but how they work should be very easy to decipher from the test scripts and the JSON-encoded output. Also, if you pass in something wrong the server will return an inelegant 500 code as opposed to useful error text. I’m not lazy – just busy.
As another example, here’s another script that will extract information for a single person based on their ID. And another one that will map urls you run across to entries in the database. (Note that the last script doesn’t work in every case yet because of feed services like feedburner that generate redirects so the final url that you might see in your browser isn’t the one that actually exists in my database. I plan to fix this by following those redirects at some point to make this a more useful service.)
But that’s just a taste of what I would like to do with the API. I think that taking this open approach will make the service more useful to people over the long run and might let people do things I’ve never even dreamed of. I believe that the best way to help people and learn about what people really want out of a site like this is to let go. The web survives on the oxygen of open data. It’s a fundamental component of the web’s success and deeply affected my thinking in building this site.
The number of sites that are supported by whoisi right now is actually pathetically small. I support flickr, picasa, twitter, linkedin and then any generic rss or atom feed. Those sites represent the vast majority of what my friends use. I plan on expanding that to include things like amazon (for wishlists), pownce and a bunch of other stuff like digg and reddit if those make sense. I would also like to know what other people would like to see on the site. Something that would enhance their experience instead of making it too noisy. That’s going to be a delicate balance.
Note that I’ve actually seen people add del.icio.us and last.fm links. They are just RSS feeds. And they work pretty well. They show up on the everyone page from time to time. Feel free to add them if you want. When it comes time to add support for them I’ll convert them to something that looks decent.
I haven’t added RSS feeds to the site. And I’m not sure what form they should take yet. Should you be able to get an RSS feed for everyone you’re following? How about for a specific person’s complete feed? Here’s the problem. If you want an RSS feed for a specific person, why not get it from them directly? I’m not adding much value by acting as a middle man in this respect. In fact, it could be argued that I’m removing value since those people might lose statistical information that they are gathering.
I’m pretty happy being the place that keeps track of where you can find people but tracking them using RSS might still be best done with RSS from that site. If someone wants a completely different interface rather than an RSS reader like I do, I think this site does well. But why disintermediate people from their audience?
Of course, this is only half my brain speaking on this topic. The other half says I should just add feeds for everything and be done with it. Because it’s really useful for people. So I might just ignore everything I just wrote and set up RSS feeds for people to use. I’m like that. Let me know what you think on the topic. I’m all ears.
Note that I will be adding feeds for “recent changes” to the site. Not necessarily for content, but for new sites and people that have been added. I’m sure that people will find that interesting and that’s specific to the site itself. Really useful.
Really, that depends on what people do with the site. There’s a bunch of stuff I know needs to be done. History links for individual people. A “recently added” link on the nav bar that shows you recent people that have joined and recent sites that have been added. The Follow page really needs to have sites that are added to people you’re following as part of the flow. And also some tools to make it easier to discover who is on the site that you might be interested in. i.e. import my twitter contacts or something like that.
A lot of people would argue that those things should have been done before the site launched. But I’m a real believer in release early, release often. So I thought I should put my cheese out there in the wind and see how people react.
Thanks, and I hope that you find the site as useful as I do.
Joi Ito has a link to something he built called a flowgram [tiny link]. What’s a flowgram? It lets you tell a story while a series of photos and web pages are shown to demonstrate what you’re talking about. I’m so used to static presentations that when I was first using it I thought that the web pages were just screen shots but when I moused over them I discovered they were live. I was surprised at how well-programmed I am to have certain expectations about how a presentation should work. You can also pause the audio and the story, browse around for a bit, and then return to the story.
It’s great to see storytelling mixed with the live web. A very neat idea and can really reset expectations about what presentations could and should be. I’ve already professed my love for visual storytelling and this is right up my alley.
(And I’m sad that a lot of it is written in flash, but don’t worry – audio and video are coming to a browser near you.)
Much like my friend Robert I have been amazed by the Martian Skies entry in The Big Picture. (Robert also has an appropriate quote in his blog which is worth reading.)
The guy behind it is Alan Taylor. There’s a good interview with him in which he explains some of the thinking behind it:
…my parents used to always have Life and National Geographic magazines around the house, I fell in love with the visual storytelling way back then. When I was getting my feet wet in the online journalism world as a developer at msnbc.com, I had the good fortune of working alongside Brian Storm and a few others in MSNBC’s photo department, who were just phenomenal as far as selection, editing and presentation.
I wondered why other sites didn’t reach that level. Many have by now, but I was still frustrated by the presentation — either far too small, or trapped in click-after-click interfaces that were in Flash or just acted as ad farms.
I’m a very visual person and photography is my favorite form of art by far. I love this particular storytelling style. It’s different than most styles of telling stories because it relies somewhat on chance and the eye of the photographer to catch a moment in time that illustrates the underlying story. You can’t tell people about what you saw – you can only show them. I love that.
Also, there’s a good quote about the workflow that he uses to find photos:
I use Firefox to browse the wire on an internal site, wired up with Greasemonkey scripts to give me decent-sized thumbs, extract caption and photo ID from the IMG tags. When I find an image I like, I save it to a local folder until I get about 25 or so good ones to choose from. Then I open all 25 in Photoshop, arrange the windows in a horizontal tile and drag them around to get a rough ordering that makes sense. Then I start to edit out images that don’t make the cut, run a couple of recorded Photoshop Actions to size the images, and do some hand-cropping if necessary.
Yay for tools to customize your browsing experience!
The entire episode is full of awesome. Go watch it. His feeds are pretty awesome, too. (Thanks go to Jonathan Zittrain who set up the moment.)
3 hours left until download day starts (10am US pacific, 1pm US eastern.)
There are direct ftp links showing up on various sites already. Please don’t give those links to other people because it doesn’t spread the load out to our huge network of mirrors. (Like the one that we just added in Japan that does 4Gb/sec – thank you, Mizukoshi-san!)