Archive for the 'technical' Category

A silver lining

Thursday, July 28th, 2011

Building on our previous rant on data caps killing The Cloud [confusion.cc]: I do think there is an opportunity for service providers in The Cloud, but it’s not really about them offering anything new or exciting in terms of technology. It’s about utility. The thing that the service providers have that over-the-top (OTT) players like Apple, Google and Microsoft don’t have is how close they are to the consumer. For my data to get to Apple or Google or Microsoft it has to traverse the service provider’s network and then some backbone provider’s network before ending up in some Microsoft, Google or Apple data center halfway around the world. On the other hand, The Cloud operated by my service provider is just down the road (in internet terms). This is where the opportunity lies.

If I were a service provider I’d put together a cloud service designed around that advantage. Rather than trying to be the be-all-end-all provider of the content itself, a nasty low-margin business (which has sidetracked me before [confusion.cc]), I’d be the best cloud for the consumers. Since I’m close and own the network, transmission quality for streaming media is within my control. So I’d sell the customer a cloud service that allowed unlimited upload, download and streaming of any data they want; I don’t care where it came from. My cloud costs you a flat rate and you can do what you want with that data over my network. At the same time there is still a cap on your out-of-network data traffic, so using someone else’s cloud could cost you, and if you want to stream a lot of data it could cost you a lot. One more thing that is needed to make this work, at least for me, is a guarantee that I can take my media back out as easily as I can put it in, so there is no data lock-in, only the typical commercial lock-in of a contract.

This is the cloud service I want: open (where I buy the content does not matter), with unlimited upload/download and streaming, high speed and good quality. I would pay for that.

Social Graphing for fun and profit

Saturday, April 23rd, 2011

The whole ‘iSpy’ issue (iPhones logging your location; see here [confusion.cc]) reminded me about the data. What good is the data?

According to Gizmodo:

Security expert, Kevin Mitnick says he’s “Quite shocked and disturbed” by the revelation, noting that the logged data could be of great interest to a variety of entities—prying spouses, private investigators, and, he reckons, the government. He speculates that the existence of the log itself “could have been at the request of the government,” as such data “can’t be used for advertisements. It seems to me more to be a governmental request.”

Gizmodo [gizmodo.com]

The story has been defused somewhat since a few people have suggested that the logging of location data is a bug [gizmodo.com].

But… let’s say it’s not a bug. Let’s say it’s intentional. Let’s go further and say that there are similar files showing who you called and who you messaged. All of this can be correlated with the timestamps so we can see who you called, when you called them and where you were. Now Apple has the same data that your phone service provider has about you. (Well, the provider has your billing address too if you’re not pre-paid. Then again, Apple most likely has a credit card on file for iTunes or the App Store, so they know where you live too…)

Why would someone want all this data? I said before that it was most likely for advertising. But Mitnick says that can’t be what it’s for. I disagree. First of all, location is one of the basic data points for traditional ad selling; Age, Sex and Location, or ASL, is the triumvirate of advertising. It’s the minimum info you need to attract advertisers. So if Apple could get your Age and Sex, maybe from your credit card data, and combine that with your location, it would have all three. (I know that your credit card gives them an address, but from the log data they can make a more detailed determination of where you actually spend your time than just your home address. For example, if you live in Brooklyn but are actually in Manhattan from 8AM to 8PM every day then maybe you’re a better target for Starbucks in Manhattan than Einstein Brothers Bagels in Brighton Beach.)

The second and more compelling reason I think the data could be good for advertising is related to Social Graphs. A Social Graph is basically a digital representation of you, the people you know, the people they know and so on. Facebook, and all social networks are Social Graphs. And the reason Facebook launched Places is because it can add location to the graph. And every additional data point added to the graph allows it to profile users better and sell more targeted advertisements. The better the targeting the more it can charge for ads.

Facebook’s Social Graph is founded on the friends that each user has. Then Facebook adds additional layers of data on top of this: everything you ‘Like’, every place you check in to, etc. All of this is used to provide a richer set of profiling data to improve the targeting of ads. But all of it is based on who you say your ‘friends’ are. This is the Explicit Social Graph.

There is another type of Social Graph however, the Implicit Social Graph. This would be a Graph built up not by who you say your ‘friends’ are but by who you actually interact with. This Graph would be developed not by asking you but by observing you, and while hiring a PI to follow everyone around would be expensive there are more passive ways of getting this data. Your phone service provider knows who you call and message and who calls and messages you, as well as where you were any time your phone is turned on. This data could be used to create an Implicit Social Graph showing who you actually interact with in the real world, which is a better picture than who you ‘friend’ online. This Implicit Social Graph could be augmented by other data in the same way that Facebook augments their Social Graph and for the same purpose: better profiling, better advertising.
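To make the idea a little more concrete, here is a rough Python sketch of how an Implicit Social Graph could be derived from call or message records. It is only an illustration under my own assumptions: the record format (a pair of phone numbers) and the function names build_implicit_graph and strongest_contacts are made up, and a real system would also weigh timestamps, call duration and location.

```python
from collections import Counter


def build_implicit_graph(records):
    """Count interactions between each pair of numbers.

    Each record is a hypothetical (caller, callee) tuple. Direction is
    ignored, so the result is an undirected graph whose edge weights are
    interaction counts.
    """
    edges = Counter()
    for caller, callee in records:
        if caller != callee:
            edges[frozenset((caller, callee))] += 1
    return edges


def strongest_contacts(edges, number, top=3):
    """Rank the numbers that `number` interacts with most often."""
    weights = Counter()
    for pair, count in edges.items():
        if number in pair:
            (other,) = pair - {number}
            weights[other] = count
    return weights.most_common(top)


# Made-up sample data: A and B talk three times, A and C once.
calls = [("A", "B"), ("B", "A"), ("A", "C"), ("A", "B")]
graph = build_implicit_graph(calls)
print(strongest_contacts(graph, "A"))
```

The point of the sketch is that nobody has to declare a ‘friend’; the ranking falls out of observed behaviour.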

So maybe Apple is not using the location data and it’s all just a bug. But I think they will want it if they can get it, and they will want those call logs and messaging logs too. Once they build their Implicit Social Graph for you they will augment it with purchase data from iTunes and maybe Safari browser history and any other data point they can get, no matter how trivial it seems. All to sell more ads.

One final note: to get this data Apple would have to jump through some hoops, collecting it on the handset and sending it back to them from time to time. And I don’t doubt that they or someone else will do it at some point. Your phone service provider has the data already; it’s a byproduct of providing your mobile phone service. They don’t seem to be doing anything with it. I’ve seen several projects discussed over the past few years about how to use it, how to create these Implicit Social Graphs and sell advertising, but I am not aware of any that have come to fruition yet. I think it’s only a matter of time till someone like Apple beats the phone companies to the prize. As usual the culture of phone companies will get in the way and they will let another revenue stream slip past them because they just can’t do it. They’re too risk averse, too cheap and too old-fashioned. Silicon Valley is going to eat their lunch and the ISPization of the phone companies will be one step closer.

The stalker in your pocket part two

Saturday, April 23rd, 2011

Stalker in our pockets

What’s the difference between the image on the left and the image on the right?

The image on the left is the recently posted map [gizmodo.com] of the data that is being stored in your iPhone (and on the computer your iPhone syncs to). That data amounts to all the locations you have taken your phone since you upgraded it to iOS 4.

The image on the right is basically the same type of data, though it’s presented as an animation so you only see one spot in the image above. That data is from your phone company, and it does not matter what phone you have: just having a phone on the network is enough for the operator to collect the data, and in many places they are required by law to keep this data for some period of time. (The map on the right also shows all the calls and messages to and from the phone, in this case stripped of the details, but be assured the raw data that your phone company has does show who you are calling. I wonder if Apple is creating a log of this data on your iPhone too? I bet they are.) I ranted about this map a while back [confusion.cc].

There seems to be a lot of concern about the fact that your iPhone is logging this type of data. The FCC wants to know why. Congress wants to know why. (See here [politico.com]). But there does not seem to be anywhere near as much concern about the fact that your phone company has the same data, more detailed data in fact. There should be. In fact, if privacy is your concern, or fear of Big Brother, you should be much more concerned about what your phone company knows than what Apple might know.

The big difference to me between the two is that the historical data that Apple is collecting is on the device and backed up to your computer, while the data that is collected by your phone service provider is on their servers and therefore subject to Lawful Intercept. According to Wikipedia, Lawful Intercept is:

obtaining communications network data pursuant to lawful authority for the purpose of analysis or evidence.

Wikipedia [wikipedia.org]

That means that all that data, including locations, calls made, calls received, messages sent and received, as well as who those calls and messages were to or from, is available to law enforcement if needed. This is generally a good thing if it helps to catch murderers or sexual predators or other criminal types. It’s not hard to imagine it being used for less savory purposes, like tracking dissidents or, in more authoritarian places, tracking political opponents or protesters. This is the kind of data that warrantless wiretapping was collecting, and it’s done by just making a request to your phone service provider. If the provider or the government is good enough they could collect this data in real time, meaning we are all carrying around Big Brother approved “bugs” in our pockets.

It’s also worth noting that the data collected by your phone company is required for it to provide the service you are paying for. There has been speculation about what Apple wants this data for; I imagine it will come down to advertising or something, some way to make more money off of iPhone owners. In the end Apple is a company interested in making money. In this case consumers will quickly forget the issue while privacy advocates piss into the wind about it for much longer.

The cloud is useless

Tuesday, April 12th, 2011

What good is the cloud? I don’t get it. This article on PC Mag [pcmag.com] talks about how all the new cloud services will change our concept of content ownership, but I think it’s bullshit. I don’t disagree with anything in the article but I think it’s all a dream, a crack dream, until one issue is solved. One issue which is outside the scope of the cloud service providers: bandwidth!

At the same time as we are seeing all these new cloud services providing us storage and access to our purchased content 24/7 streaming to any device, anywhere, any time, we are also seeing the death of unlimited bandwidth. Even for home access. How am I supposed to stream my content all over the place if I don’t have any bandwidth?

Take this scenario from the PC Mag article:

The parent whose child wants to watch “Dora the Explorer: Big Sister Dora” over and over and over again doesn’t have to own the DVD or even the digital file. Cloud-based ownership and access means that their child can see Dora play big sister at home, on the iPad, in the car, and on mommy’s smartphone. They own the movie or, more likely, have an all-you-can eat subscription service, so each viewing costs nothing except the price of Internet access.

The emphasis is mine, because it’s the part that kills the whole scenario.

I might be a strange consumer by today’s measure: I’ve digitized all my content. I’ve got more than 1200 CDs that I digitized before I started buying digital music, 200+ DVDs that I have digitized, and 7 years’ worth of digital photos and video that alone amount to more than 12 gigs worth of keepers. All in all I have more than a terabyte of digital content, all happily sitting on my 8TB NAS server, mirrored and striped, high up on the shelf in the back room.

To get streaming access to all this content today I can jump through a bunch of hoops and make it work. But… I would max out my mobile data plan every month — 12GB — due to my daughter streaming Dora, and Toy Story 3 and Kai-Lan and whatever new, or old, show it was this week (actually currently it’s My Neighbor Totoro), to the iPad while we are driving or shopping or wherever. So for now she is restricted to the content that is actually on the device, and I fill up the devices quickly. I can’t even put all the Pixar movies on the iPad and have the family photos on there, 64GB is just not enough.

While I may be the exception today, this will be normal one day when every piece of content we ever buy is stored on the cloud, ready for on-demand download or streaming to any device over any network. But until the bandwidth issue is solved it will be any network except the mobile one, and only till the service provider throttles me or cuts me off for exceeding my bandwidth cap for the month. Bottom line: the scenario from the PC Mag article is pointless without unlimited bandwidth. Memory is cheap; bandwidth is the new memory.

Why cloud backup for your mobile will not be provided by your operator

Thursday, March 31st, 2011

This article [zdnet.com] and several others making the rounds in the past few days point to Microsoft re-branding the cloud backup service it included with its short-lived Kin line of mobiles. The cloud backup, Kin Studio, was the coolest feature of the Kin phones; maybe not the sexiest but the most useful. Now it looks like Microsoft may add it to Windows Phone 7 handsets. If they combine it with the Windows Live service, providing 25GB of free cloud storage connected to the user’s Hotmail/Windows Live and Office Live accounts, then they may have a compelling offer.

Of course Microsoft is not the only mover; Apple has long had its MobileMe service, which has significant overlap. To date this product has only attracted hardcore Apple fan-boys, but for over a year now there has been a rumor that Apple will drop the subscription fee and include it as a free service for all iOS devices. (More recently there has been a rumor that Apple will drop the subscription fee to $20 a year; I think maybe it will be free for 1 year with your iOS device and then $20 a year unless you buy a new iOS device.) Link this to the rumored iTunes media cloud service that will run out of the billion dollar datacenter Apple has built in North Carolina. Again, this could be a very useful service providing automated backup and streaming of all media (movies, photos, music, contacts, messages) to the cloud.

Google wouldn’t have to move very far to offer the same sort of service with Android.

In my time in the telco industry I’ve seen several projects at mobile operators around the world try to provide this type of data backup service. Unfortunately I’m not aware of any that actually succeeded. They died for many reasons — customers not willing to pay for the service, limited features, crippled features, lack of marketing, lack of handset support…

All in all I think the data-backup-as-a-service boat has already set sail and the telcos will be left behind due to their own dithering on how to make money on the offering. The same thing happened to them with Location Based Services: they could not figure out how to make money on it so they never launched it. The phone makers opened the on-device location services (initially mandated for emergency number calling) to application developers, and they figured out how to make money from it. Well, more or less. So the telcos are left with LBS systems that cost them money but generate no revenue and don’t provide any value even in generating ‘customer stickiness’.

C’est la vie. Real consumer service innovation in the mobile market continues to move away from the telcos and towards the internet. It’s one more step on the road to mobile dumb pipe networks.

Mobile Ad Serving

Monday, February 22nd, 2010

I’ve seen a lot of AdMob [admob.com] ads in iPhone applications over the past two years. But recently I downloaded two free (i.e. ad-supported) applications that appear to use a Google ad server. (Of course Google purchased AdMob, but I don’t think it’s just a re-branding; I think it’s a different service.)

I noticed this for two reasons. First; for the past few years I have been working in the mobile advertising space, so this stuff stands out. Second; the ads were odd.

Why odd? Well, as I mentioned these were ads in iPhone applications. Keep that in mind and take a look at the ads:

Google mobile ads banner for SingTel Mobile SME services
Google mobile ads banner for Android
Google mobile ads banner for Nokia 5530 Apps Store
Google mobile ads banner for Nokia E63 Apps Store

The issue, I think, with these ads is they are all for competitors. I can excuse the SingTel ad: I’m a Starhub customer, but this is an ad for enterprise services and it was served while I was on a WiFi network, so there is not much chance that Google could have determined my operator.

However, the ad for Android and the two ads for the Ovi store (which sells applications that work on Nokia handsets) are not useful to me as a consumer and most likely a waste of the advertiser’s money. The odds that I am going to patronize either of these services from my iPhone are next to zero.

And there is little excuse for this. When I was working on requirements for a mobile advertising system the product team was adamant that it include basic relevance filtering. Now relevance filtering is complex and the simple business rule “the advertisement should be relevant and useful to the consumer” actually breaks down to a lot of technical requirements. The technical requirements of significance here are:

Requirement 1

The system shall attempt to retrieve the User-Agent header from the HTTP request. The header should be used to reduce the pool of relevant banners by removing banners that:

  1. have a User-Agent whitelist that does not include the User-Agent retrieved from the HTTP request
  2. have a User-Agent blacklist that does include the User-Agent retrieved from the HTTP request
Requirement 2
The system shall allow buyers to construct a whitelist or a blacklist of User-Agent strings for specific campaigns or for banners in a campaign.
Requirement 3
User-Agent lists (white- or black-) should be constructed of strings entered by buyers by selecting full User-Agents or pre-coded regular expressions from a list, or by entering an arbitrary regular expression.
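To make Requirement 1 concrete, here is a minimal Python sketch of the filtering step. It is not the system I worked on: the Banner class, its field names and the filter_banners function are invented for illustration, and I am assuming that list entries are regular expressions matched case-insensitively anywhere in the User-Agent, and that a request with no User-Agent is simply not filtered.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Banner:
    """Hypothetical banner record; the field names are illustrative only."""
    name: str
    ua_whitelist: list = field(default_factory=list)  # regex strings
    ua_blacklist: list = field(default_factory=list)  # regex strings


def matches_any(patterns, user_agent):
    """True if any pattern is found anywhere in the User-Agent string."""
    return any(re.search(p, user_agent, re.IGNORECASE) for p in patterns)


def filter_banners(banners, user_agent):
    """Reduce the banner pool according to the two rules of Requirement 1."""
    if not user_agent:
        # Assumption: with no User-Agent header there is nothing to filter on.
        return list(banners)
    kept = []
    for banner in banners:
        if banner.ua_whitelist and not matches_any(banner.ua_whitelist, user_agent):
            continue  # whitelist exists but does not include this User-Agent
        if banner.ua_blacklist and matches_any(banner.ua_blacklist, user_agent):
            continue  # blacklist includes this User-Agent
        kept.append(banner)
    return kept


IPHONE_UA = (
    "Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_0 like Mac OS X; en-us) "
    "AppleWebKit/528.11 (KHTML, like Gecko) Version/3.1.1 Mobile/7A238j Safari/525.20"
)
pool = [Banner("ovi-store", ua_whitelist=["nokia"]), Banner("generic")]
print([b.name for b in filter_banners(pool, IPHONE_UA)])  # prints ['generic']
```

A banner with no lists at all passes straight through, which matches the idea that filtering is something a buyer opts into per campaign or banner.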

(Obviously there are a lot more details and other requirements that need to be clarified before you can actually implement this.) Here’s a use case that would prevent the issue of the Ovi banners being shown to me on my iPhone.

Use Case:
Setting up a campaign level whitelist, to filter out non-Nokia handsets.
Pre-Condition:
The buyer has created a campaign
The buyer selects Relevance Filtering from the interface
Post-Condition:
A new filter is added to the campaign level whitelist for the selected campaign
Scenario:
  1. The Buyer selects Create/Edit Whitelist from the Relevance Filtering menu.
  2. The System loads the Relevance Whitelist interface
  3. The Buyer selects the Filter Campaign checkbox
  4. The Buyer clicks on the Add Filter button
  5. The Buyer selects the Manual Filter filtering method
  6. The Buyer enters *nokia* in the Manual Filter textbox and clicks the Case Insensitive checkbox
  7. The Buyer clicks on the Test Filter button
  8. The System displays a list of all matching User-Agent Strings, highlighting the match(es)
  9. The Buyer clicks on Confirm Filter button
  10. The System adds the filter to the campaign level whitelist in the Database for the selected campaign

So now the Buyer (in this case Nokia or an agent acting on Nokia’s behalf to set up the ad campaign) has created a campaign level whitelist (i.e. all banners in the campaign will be filtered by this whitelist) which includes a filter of *nokia* that is case insensitive. This means that, based on the requirements enumerated above, no ad request with a User-Agent string that does not include the word nokia will be served any banner from this campaign. The effect? Ovi store ads will only be shown to users who are using Nokia handsets (or whose requesting user agent does not include a User-Agent string, or is using an incorrect User-Agent string that includes the word nokia).

Let’s look at two use cases for requesting an ad. One where the requesting handset is a Nokia N95 and one where it is an iPhone 3GS.
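One detail worth spelling out: *nokia* is a shell-style wildcard, not a valid regular expression (a leading * has nothing to repeat). Here is a small Python sketch of how the Manual Filter and the Test Filter button could treat it, assuming the filter is translated from wildcard syntax before matching. The function names compile_manual_filter and preview_matches are hypothetical; fnmatch.translate and re are from the Python standard library.

```python
import fnmatch
import re


def compile_manual_filter(pattern, case_insensitive=False):
    """Compile a buyer-entered wildcard filter such as *nokia* to a regex."""
    flags = re.IGNORECASE if case_insensitive else 0
    # fnmatch.translate turns shell-style wildcards into an anchored regex.
    return re.compile(fnmatch.translate(pattern), flags)


def preview_matches(compiled, known_user_agents):
    """Return the known User-Agent strings that the filter matches."""
    return [ua for ua in known_user_agents if compiled.match(ua)]


KNOWN_USER_AGENTS = [
    "Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_0 like Mac OS X; en-us) "
    "AppleWebKit/528.11 (KHTML, like Gecko) Version/3.1.1 Mobile/7A238j Safari/525.20",
    "Mozilla/5.0 (SymbianOS/9.2; U; Series60/3.1 NokiaN95/10.0.018; "
    "Profile/MIDP-2.0 Configuration/CLDC-1.1 ) AppleWebKit/413 (KHTML, like Gecko) Safari/413",
]

nokia_filter = compile_manual_filter("*nokia*", case_insensitive=True)
for ua in preview_matches(nokia_filter, KNOWN_USER_AGENTS):
    print(ua)
```

With the Case Insensitive checkbox ticked only the N95 string is listed. Without it nothing matches, because the handset reports itself as NokiaN95 with a capital N, which is exactly why the Test Filter step is worth having.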

Use Case:
An Apple iPhone 3GS makes an Ad Request from an application running the Ad Server SDK.
Pre-Condition:
The application is using the provided Ad Server SDK
The application makes an ad request
Post-Condition:
A banner is served
Scenario:
  1. The application sends a well-formed HTTP GET Request to the Ad Request Handler URL including an Ad Request payload and the Device User-Agent Header (Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_0 like Mac OS X; en-us) AppleWebKit/528.11 (KHTML, like Gecko) Version/3.1.1 Mobile/7A238j Safari/525.20)
  2. The Ad Request Handler Thread retrieves the Ad Request Payload and the HTTP User-Agent Header, constructs an Ad Request, sets its state to New and pushes the Ad Request onto the Inbound Ad Request Queue
  3. The Ad Request Handler Thread registers with the Ad Dispatcher to wait for its Ad.
  4. The Ad Request Processing Thread pops the next Ad Request off of the Inbound Ad Request Queue
  5. The Ad Request Processing Thread checks the Ad Request state
  6. The Ad Request Processing Thread finds the Ad Request state is New
  7. The Ad Request Processing Thread pushes all ads in its Recycle Ad Queue onto the Active Ad Queue
  8. The Ad Request Processing Thread sets the Ad Request state to Unfulfilled
  9. The Ad Request Processing Thread pops the first ad off the Active Ad Queue
  10. The Ad Request Processing Thread checks the selected ad for White- and Black- lists
  11. The Ad Request Processing Thread finds an active Campaign Level User-Agent Whitelist
  12. The Ad Request Processing Thread attempts to match each string in the Campaign Level User-Agent Whitelist against the Device User-Agent String in the Ad Request
  13. The Ad Request Processing Thread finds no match for the string *nokia* (case insensitive) in the Device User-Agent String: Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_0 like Mac OS X; en-us) AppleWebKit/528.11 (KHTML, like Gecko) Version/3.1.1 Mobile/7A238j Safari/525.20
  14. The Ad Request Processing Thread rejects the ad for this request placing it on the Recycle Ad Queue
  15. The Ad Request Processing Thread pushes the Ad Request back on the Inbound Ad Request Queue
  16. The Ad Request Processing Thread monitors the Inbound Ad Request Queue
  17. The Ad Request Processing Thread pops the next Ad Request off of the Inbound Ad Request Queue
  18. The Ad Request Processing Thread checks the Ad Request state
  19. The Ad Request Processing Thread finds the Ad Request state is Unfulfilled
  20. The Ad Request Processing Thread pops the first ad off the Active Ad Queue
  21. The Ad Request Processing Thread checks the selected ad for White- and Black- lists
  22. The Ad Request Processing Thread finds no active White- or Black- lists for the ad
  23. The Ad Request Processing Thread selects the Ad for this Ad Request, attaches the Ad to the Ad Request and updates the Ad Request state to Pending
  24. The Ad Request Processing Thread pushes the Ad Request onto the Outbound Ad Request Queue
  25. The Ad Request Processing Thread pushes all ads in its Recycle Ad Queue onto the Active Ad Queue
  26. The Ad Request Processing Thread monitors the Inbound Ad Request Queue
  27. The Ad Dispatcher pops the Ad Request off the Outbound Ad Request Queue
  28. The Ad Dispatcher reads the Ad Handler Thread Id from the Ad Request and passes the Ad Request to the corresponding Ad Handler Thread
  29. The Ad Dispatcher Monitors the Outbound Ad Request Queue
  30. The Ad Handler Thread parses the Ad Request and retrieves the ad
  31. The Ad Handler Thread constructs an HTTP Response with the Ad Response Payload including the Ad
  32. The Ad Handler Thread sends the HTTP Response to the requesting application

A few explanations/notes:

The looping nature of the Ad Request Processing Thread makes it hard to write an efficient and clear Use Case but I’m not getting paid to do this (any more) so I’m not going to spend the time to do the work to make it clearer.

This use case leaves out details of filters other than the Whitelist filter, such as Logged In/Logged Out Ad Queue selection, Ad Unit size, etc. for simplicity.

There would be multiple Ad Request Processors/Threads running. When the first thread retrieves the Ad Request and finds it in the New state, it pushes all the ads in its internal Recycle Ad Queue back onto the Active Ad Queue, because those banners still need to be served and may be suitable for this Ad Request. The second time, when the Ad Request Processing Thread finds the Ad Request in the Unfulfilled state, it does not empty its Recycle Ad Queue, to avoid endlessly looping over the same unsuitable ad. (Writing this, I think I need to think more about it: there could be a condition where a thread only ever sees Ad Requests in the Unfulfilled state and would therefore never empty its Recycle Ad Queue…)

I’ve included a lot of stuff that would actually go to other use cases and just be referenced as Include X Use Case, again I hope this makes it clearer (at least to the techies) what is happening.
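For the techies, the core of the scenario above can be boiled down to a few lines. This is a single-threaded Python sketch, not the multi-threaded design described in the use case: it collapses the New/Unfulfilled/Pending re-queueing of the Ad Request into a plain loop, and the ad dictionaries, the ua_whitelist key and the select_ad function are all made up for illustration.

```python
import re
from collections import deque


def select_ad(active_ads, user_agent):
    """Pop ads off the active queue until one passes its whitelist.

    Rejected ads are parked on a recycle queue and pushed back afterwards,
    so they stay available for later requests.
    """
    recycle = deque()
    chosen = None
    while active_ads:
        ad = active_ads.popleft()
        whitelist = ad.get("ua_whitelist", [])
        if whitelist and not any(
            re.search(p, user_agent, re.IGNORECASE) for p in whitelist
        ):
            recycle.append(ad)  # unsuitable for this request only
            continue
        chosen = ad
        break
    active_ads.extend(recycle)  # return rejected ads to the active queue
    return chosen


IPHONE_UA = (
    "Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_0 like Mac OS X; en-us) "
    "AppleWebKit/528.11 (KHTML, like Gecko) Version/3.1.1 Mobile/7A238j Safari/525.20"
)
queue = deque([
    {"name": "ovi-store", "ua_whitelist": ["nokia"]},
    {"name": "generic"},
])
served = select_ad(queue, IPHONE_UA)
print(served["name"], [ad["name"] for ad in queue])
```

For the iPhone request the Ovi banner is skipped, the generic banner is served, and the Ovi banner goes back on the queue to wait for a Nokia handset. Because the recycle queue here is local to one call, the sketch sidesteps the never-emptied Recycle Ad Queue problem I worried about above, at the cost of ignoring the threading entirely.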

Anyway… Here is what would happen when a Nokia N95 made a request:

Use Case:
A Nokia N95 makes an Ad Request from an application running the Ad Server SDK.
Pre-Condition:
The application is using the provided Ad Server SDK
The application makes an ad request
Post-Condition:
A banner is served
Scenario:
  1. The application sends a well-formed HTTP GET Request to the Ad Request Handler URL including an Ad Request payload and the Device User-Agent Header (Mozilla/5.0 (SymbianOS/9.2; U; Series60/3.1 NokiaN95/10.0.018; Profile/MIDP-2.0 Configuration/CLDC-1.1 ) AppleWebKit/413 (KHTML, like Gecko) Safari/413)
  2. The Ad Request Handler Thread retrieves the Ad Request Payload and the HTTP User-Agent Header, constructs an Ad Request, sets its state to New and pushes the Ad Request onto the Inbound Ad Request Queue
  3. The Ad Request Handler Thread registers with the Ad Dispatcher to wait for its Ad.
  4. The Ad Request Processing Thread pops the next Ad Request off of the Inbound Ad Request Queue
  5. The Ad Request Processing Thread checks the Ad Request state
  6. The Ad Request Processing Thread finds the Ad Request state is New
  7. The Ad Request Processing Thread pushes all ads in its Recycle Ad Queue onto the Active Ad Queue
  8. The Ad Request Processing Thread sets the Ad Request state to Unfulfilled
  9. The Ad Request Processing Thread pops the first ad off the Active Ad Queue
  10. The Ad Request Processing Thread checks the selected ad for White- and Black- lists
  11. The Ad Request Processing Thread finds an active Campaign Level User-Agent Whitelist
  12. The Ad Request Processing Thread attempts to match each string in the Campaign Level User-Agent Whitelist against the Device User-Agent String in the Ad Request
  13. The Ad Request Processing Thread finds a match for the string *nokia* (case insensitive) in the Device User-Agent String: Mozilla/5.0 (SymbianOS/9.2; U; Series60/3.1 NokiaN95/10.0.018; Profile/MIDP-2.0 Configuration/CLDC-1.1 ) AppleWebKit/413 (KHTML, like Gecko) Safari/413
  14. The Ad Request Processing Thread selects the Ad for this Ad Request, attaches the Ad to the Ad Request and updates the Ad Request state to Pending
  15. The Ad Request Processing Thread pushes the Ad Request onto the Outbound Ad Request Queue
  16. The Ad Request Processing Thread pushes all ads in its Recycle Ad Queue onto the Active Ad Queue
  17. The Ad Request Processing Thread Monitors the Inbound Ad Request Queue
  18. The Ad Dispatcher pops the Ad Request off the Outbound Ad Request Queue
  19. The Ad Dispatcher reads the Ad Handler Thread Id from the Ad Request and passes the Ad Request to the corresponding Ad Handler Thread
  20. The Ad Dispatcher Monitors the Outbound Ad Request Queue
  21. The Ad Handler Thread parses the Ad Request and retrieves the ad
  22. The Ad Handler Thread constructs an HTTP Response with the Ad Response Payload including the Ad
  23. The Ad Handler Thread sends the HTTP Response to the requesting application

This type of White- and Black- list filtering would prevent me from seeing ads for the Ovi store on my iPhone (assuming the people provisioning the ads or campaigns used, and used correctly, the filtering options mentioned above — but that’s a business problem not a technical one.)

The ad serving space is complicated, but I have been surprised to see a number of examples like this on my phone over the past few days. In fact most of the ads I receive in the applications that use the Google system rather than the AdMob system seem to fall into this category. I have noticed a lack of relevant ads in general for Singapore in the Google Web system. On some sites I see very relevant ads: global companies doing brand building or selling online. But on a lot of sites, and I mean a lot, I see the same ads for the same service, a Singapore-specific company, so at least the location filtering is working. Or I see public service ads (invariably for Kiva [kiva.org]). So either there are not enough relevant ads for Singapore or these local guys are spending a lot of money to spam everyone.

The people at Google are a lot smarter than me, so I wonder if there is something I am missing in all this? Were my product people wrong? Are the users who are creating the campaigns not using some feature of the Google system that would filter these ads from me? Am I completely nuts? Do I have too much time on my hands?

Concerning Color Algorithms

Thursday, July 30th, 2009

Long ago I worked on an interface for a reporting system that displayed transaction volume data as a large table; days on one axis and hours on the other. To improve the readability of this densely packed data we wrote a coloring algorithm to change the background color of the table cell. The algorithm colored each cell in relation to every other cell using the minimum and maximum values of the displayed data. The effect was quite good. Users could instantly see patterns in the data without having to read the actual number in the cells. This data display, though simple, was one of the main selling points of the system. End-users like the sexy packaging; the unseen technology that actually makes the system a wonder is only of interest to the developers, unless of course it does not work.

I spent a lot of time improving the UI of the system as a project once, and when my CTO asked me what I was doing one day I said “optimizing the color algorithm”. That was a mistake. I didn’t sleep for the next 6 months as I was assigned the new, high priority, has-to-be-done-now project, for a new technology I knew nothing about. It was fun, but that’s a story for another day. The reason I raise this here is that I never finished the intended work on the color algorithm. One of the major goals that I never got to was to fix the color range. The algorithm as implemented ended up with colors that transitioned from purple to yellow. I wanted to improve the algorithm to output colors that transitioned from red through yellow to green. Never got done.

Recently I was working on a project that required some visual UI work. One of the elements was a bar chart that displayed volume over time. So I used the same-old, same-old color algorithm (in PHP):

/* 
* getColor()
* $max := the highest value in the set to be colored.
* $min := the lowest value in the set to be colored.
* $c := the current item's value.
*
* Returns a string; the HEX representation of an RGB color.
*/
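function getColor( $max, $min, $c) {
   // Scale $c to a 0-20 index; a flat data set (max == min) maps to 0.
   $j = $max != $min ? ( $c - $min ) * 20 / ( $max - $min) : 0;
   $j = $j > 20 ? 20 : $j;

   // Each channel below is a single hex digit, so it is clamped to 0-15.
   $r = 5 + $j;
   $r = $r > 15 ? 15 : $r;

   $g = 5 + $j/2;
   $g = $g > 15 ? 15 : $g;

   $b = 10 - $j/2;
   $b = $b < 0 ? 0 : $b;

   // Six hex digits: red twice, then green and blue each padded with a 0.
   return sprintf( "%01x%01x%01x%01x%01x%01x", $r,$r,$g,0,$b,0 );
}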

Simple, right? Well, using that you end up with this:
Old getColor output
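To see where the purple and the yellow come from, it helps to evaluate the function at the bottom, middle and top of a range. The sketch below uses a condensed copy of the function, renamed oldGetColor() purely for this illustration, and an assumed range of 0 to 100.

```php
<?php
/*
* oldGetColor()
*
* A condensed copy of the original getColor(), renamed for this
* illustration only. It assumes $c lies between $min and $max.
*/
function oldGetColor( $max, $min, $c ) {
   $j = $max != $min ? ( $c - $min ) * 20 / ( $max - $min ) : 0;
   $j = min( $j, 20 );

   $r = min( 5 + $j, 15 );
   $g = min( 5 + $j / 2, 15 );
   $b = max( 10 - $j / 2, 0 );

   return sprintf( "%01x%01x%01x%01x%01x%01x", $r, $r, $g, 0, $b, 0 );
}

echo oldGetColor( 100, 0, 0 ), "\n";   // 5550a0, a muted purple
echo oldGetColor( 100, 0, 50 ), "\n";  // ffa050, orange
echo oldGetColor( 100, 0, 100 ), "\n"; // fff000, yellow
```

Red saturates halfway up the range while blue only fades out at the top, which is why the scale never passes through anything resembling green.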

So I started scratching my itch to fix the colors.

My first thought was to just ask a question over at StackOverflow [stackoverflow.com]. As I was typing in my question the system suggested existing questions… and since it’s an SO faux pas to repeat questions I started looking at them. It didn’t take long before I found two questions that seemed to be pointing me in the right direction: 1. Generate colors between red and green for a power meter [stackoverflow.com] and 2. Color scaling function [stackoverflow.com].

Reading the answers to those questions it took about 30 minutes to do this:

/* 
* getColor()
*
* $max := the highest value in the set to be colored.
* $min := the lowest value in the set to be colored.
* $c := the current item's value.
*
* Returns a string; the RGB representation of a color (e.g. "rgb(255,255,255)").
*/
function getColor( $max, $min, $c) {
   // Guard against a flat data set (max == min), which would divide by zero.
   $h = $max != $min ? (.33 / ($max - $min)) * ($c - $min) : 0;
   $s = .8;
   $v = 1;

   $rgb = hsv2rgb($h, $s, $v);

   return(sprintf("rgb(%d,%d,%d)", $rgb['r'], $rgb['g'], $rgb['b']));
}

/* 
* hsv2rgb()
*
* $h := the hue, normalized to 0-1.
* $s := the saturation, normalized to 0-1.
* $v := the value, normalized to 0-1.
*
* Returns an associative array containing the red ('r'), green ('g') and blue ('b') values in the range 0-255.
*/
function hsv2rgb ($h, $s, $v){
   $rgb = array();

   if($s == 0){
      $r = $g = $b = $v * 255;
   } else {
      $var_h = $h * 6;
      if ($var_h == 6) { $var_h = 0; } // a hue of 1 wraps around to 0 (red)
      $var_i = floor( $var_h );
      $var_1 = $v * ( 1 - $s );
      $var_2 = $v * ( 1 - $s * ( $var_h - $var_i ) );
      $var_3 = $v * ( 1 - $s * (1 - ( $var_h - $var_i ) ) );

      if ($var_i == 0) { $var_r = $v ; $var_g = $var_3 ; $var_b = $var_1 ; }
      else if ($var_i == 1) { $var_r = $var_2 ; $var_g = $v ; $var_b = $var_1 ; }
      else if ($var_i == 2) { $var_r = $var_1 ; $var_g = $v ; $var_b = $var_3 ; }
      else if ($var_i == 3) { $var_r = $var_1 ; $var_g = $var_2 ; $var_b = $v ; }
      else if ($var_i == 4) { $var_r = $var_3 ; $var_g = $var_1 ; $var_b = $v ; }
      else { $var_r = $v ; $var_g = $var_1 ; $var_b = $var_2 ; }

      $r = $var_r * 255;
      $g = $var_g * 255;
      $b = $var_b * 255;
   }

   $rgb['r'] = $r;
   $rgb['g'] = $g;
   $rgb['b'] = $b;

   return $rgb;
}

New getColor output

That’s great! Now maybe I’ll play with the hue bounds, the saturation and the value, but it looks fairly good on my first try.
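If I do get around to playing with the hue bounds, one way to make them adjustable is to split the hue calculation out of getColor(). This is only a sketch: scaleHue() and its $hueStart and $hueEnd parameters are hypothetical names, not part of the code above, and the result is meant to be passed to hsv2rgb() together with whatever saturation and value you choose.

```php
<?php
/*
* scaleHue() (hypothetical helper for experimenting with the hue bounds)
*
* $max := the highest value in the set to be colored.
* $min := the lowest value in the set to be colored.
* $c := the current item's value.
* $hueStart := the hue for the lowest value (0 is red).
* $hueEnd := the hue for the highest value (.33 is roughly green).
*
* Returns the hue, normalized to 0-1.
*/
function scaleHue( $max, $min, $c, $hueStart = 0.0, $hueEnd = .33 ) {
   // A flat data set (max == min) has no spread; use the starting hue.
   if ( $max == $min ) {
      return $hueStart;
   }
   $fraction = ( $c - $min ) / ( $max - $min );
   return $hueStart + ( $hueEnd - $hueStart ) * $fraction;
}

$low  = scaleHue( 100, 0, 0 );              // 0, red for the lowest value
$high = scaleHue( 100, 0, 100 );            // .33, green for the highest value
$flip = scaleHue( 100, 0, 100, .33, 0.0 );  // 0, red for the highest value
```

Swapping the two bounds flips the scale, which is handy when a high number is the bad news and should show up in red.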

Moral of the story: this is what StackOverflow is for; getting an answer to a question without even asking it… Because you are not re-inventing the wheel, someone had your problem before, and maybe someone found the answer already. SO, thanks to Paul Dixon [stackoverflow.com] (also here [blog.dixo.net]) and ΤΖΩΤΖΙΟΥ [stackoverflow.com] for their answers!