Tablet PC Thoughts

Tuesday, July 18, 2006

Google knows who you REALLY are!


It's always fun to learn whole new layers of technology. What I'm posting about here is probably known by a lot of people, but my recent involvement in two new start-up companies has really made me think about the breadth and depth of data mining occurring on the Internet involving personal behavior and habits. And one of the largest harvesters of all of that personal information is Google. There are already others who cover this much better than I do ... Google Watch is one ... however I still wanted to blog about this.

Two of the four start-ups that I am now involved in are working on web applications - hosted services - that want to provide new levels of social and affiliate networks. With one start-up we are creating a new form of video advertising on the net, with a full affiliate marketing network behind it. So it becomes important to track when affiliates (bloggers or web sites that host the ads) cause sales to occur. When that happens they get paid a commission. With the other start-up we are creating a new interactive media type that can be spread virally through web sites, e-mail and IM. With this solution we want to be able to track and map the viral spread to acknowledge and reward the people who are able to cause the most spread.

As my teams and I began to build both of these solutions we began to examine how other vendors are accomplishing the same things. We have now looked at dozens of implementations, and then created our own solutions that we believe will give us what we are after. While doing this I began to see a pattern: an amazing wealth of personal information that Internet users are giving away about themselves ... about who they REALLY are. And one of the largest consumers of all of this personal behavioral information is Google. It's really the scale of their ability to gather this data that caused me to pause and think.

It all starts with a cookie



In doing some research into how to track consumers, I was surprised to find that most people agree that 99%+ of web browsers operate using the default settings when it comes to cookies. Cookies are the small pieces of data that a web site can pass down to your web browser, and from then on - until the cookie expires - that data is passed back to the web site every time that you access it. Cookies can be defined to last for a very short amount of time - just that particular session - or a very long amount of time ... decades, or even hundreds of years.
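To make the short-lived versus long-lived distinction concrete, here is a minimal sketch using Python's standard `http.cookies` module. The cookie name `visitor_id`, the value `abc123`, and the 30-year lifetime are made up for illustration; this is not the cookie Google actually sets.

```python
from http.cookies import SimpleCookie


def build_cookie_header(name, value, max_age_seconds=None):
    """Return the value of a Set-Cookie header.

    With no max-age the browser treats it as a session cookie and drops it
    when the session ends; with a max-age it is kept for that many seconds.
    """
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    if max_age_seconds is not None:
        cookie[name]["max-age"] = max_age_seconds
    return cookie[name].OutputString()


# Hypothetical values, for illustration only.
session_header = build_cookie_header("visitor_id", "abc123")
long_lived_header = build_cookie_header(
    "visitor_id", "abc123", max_age_seconds=30 * 365 * 24 * 3600
)

print(session_header)
print(long_lived_header)
```

The only difference between the two headers is the lifetime attribute, which is all it takes to turn a throwaway identifier into one that follows a browser around for decades.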

So when you first visited Google ... the very first time ... you got your first Google cookie. And this is a good starting point ... when did YOU lose your Google-virginity? When exactly was that first time? Google knows. Even if you have changed computers, browsers, upgraded, etc. there is a chance that Google still knows. They know the year, day, hour, minute, and second. You were given the mark of Google. Ok ... big deal ... so what.

Tracking what you search



The first thing they are now able to do is track every single search that you perform on Google. Lots of people know about this, and understand this is the case. They also know the time of day, day of the week, phase of the moon, weather conditions, popular news, and even the popularity of that particular search when you did it! So what searches do you tend to do late at night during a full moon? Ask Google ... they know!

In my opinion, it's not really the details of what you searched that have the real value ... it is when you did them, and in what sequences, and what other patterns emerge about you. This is where your true identity begins to emerge. What? You were on-line searching on a Friday night? Not out with friends?

Proliferation of AdWords



Ok ... now this next part is where I started to really think. While working on how to dynamically inject video advertising into a web page, I found that Google is using a very interesting technique for AdWords and Google Analytics. Again ... it's very simple and easy, and many people know this ... however many people do not. And the implications are very interesting.

If you have a web site, and you choose to place AdWords on your web site, Google will give you a nice little bit of HTML to embed in your page. That HTML includes a script tag that will fetch a snippet of JavaScript code from Google's servers. The JavaScript then causes the AdWords ads to be rendered within your web page. It's actually pretty impressive that when I browse to your website, without being told a thing, my browser will automatically load your page and go and load the script from Google's servers. Clean ... transparent. Ok yeah ... and when it did that ... the Google cookie went with that request. Remember the Google cookie?

Yes ... now it's not just the searches that you do on Google's web site that are being tracked, but also every single web page that you visit that contains Google AdWords!
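Here is a rough sketch of that mechanism, written as a tiny simulation rather than real browser code. The host names `blog.example` and `ads.example`, the script path, and the cookie value are all hypothetical, and real browsers match cookies by domain and path rather than by exact host name as this sketch does.

```python
from urllib.parse import urlparse


def requests_for_page(page_url, script_urls, cookie_jar):
    """Model the requests a browser sends for a page and its embedded scripts.

    cookie_jar maps a host name to the cookie string stored for that host.
    Exact host matching is a simplification of real cookie matching rules.
    """
    requests = []
    for url in [page_url] + list(script_urls):
        host = urlparse(url).hostname
        headers = {}
        if url != page_url:
            # Embedded resources are requested with the hosting page as referrer.
            headers["Referer"] = page_url
        if host in cookie_jar:
            # Any cookie already stored for that host rides along automatically.
            headers["Cookie"] = cookie_jar[host]
        requests.append({"url": url, "headers": headers})
    return requests


# Hypothetical hosts and cookie, for illustration only.
cookie_jar = {"ads.example": "visitor_id=abc123"}
sent = requests_for_page(
    "http://blog.example/post.html",
    ["http://ads.example/show_ads.js"],
    cookie_jar,
)
script_request = sent[1]
print(script_request["headers"])
```

The request for the script carries both the visitor's existing cookie and the address of the page being read, which is everything the ad server needs to record "this visitor was on that page".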

Tracking what web sites you visit

Google is now notified by your browser any time that you visit a web site that hosts Google AdWords ... and it only gets better. Google recently announced Google Analytics. This is a service that allows web site owners to get detailed analysis of the traffic to their web site, and about the visitors to their web site. Any web site owner who wants this impressive reporting can simply request that Google give them an account. When approved, Google will provide access to the Google Analytics web site, and there you get ... another little bit of HTML to put into your web pages. The little snippet again requests a script from Google, and of course passes along your cookie!

So now Google knows what you search, and what sites you visit that have AdWords, and now any site that uses Google Analytics. I'm digging to find figures to understand just how much of the Internet now falls into this category, but it is a large number of sites. And just like the searches, Google not only knows what web sites you have visited, but at what time, in what order. Combined with their broad indexes of Internet content, they have the ability to categorize those sites. Combined with all other types of data they can really begin to get an idea of just who you are, what you do and when, on the Internet. I really begin to wonder what some of the patterns must look like.
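As a thought experiment, here is a minimal sketch of how logged requests could be turned into a per-visitor browsing history. The record layout (cookie ID, timestamp, page) and the sample data are my own assumptions; I have no knowledge of how Google actually stores or processes this.

```python
from collections import defaultdict


def build_timelines(hits):
    """Group (visitor_id, timestamp, page) records into ordered page lists.

    Timestamps are ISO 8601 strings in the same time zone, so sorting them
    as text also sorts them chronologically.
    """
    events_by_visitor = defaultdict(list)
    for visitor_id, timestamp, page in hits:
        events_by_visitor[visitor_id].append((timestamp, page))
    return {
        visitor_id: [page for _, page in sorted(events)]
        for visitor_id, events in events_by_visitor.items()
    }


# Made-up log records, deliberately out of order.
hits = [
    ("abc123", "2006-07-14T23:40:00", "http://news.example/story"),
    ("xyz789", "2006-07-14T09:00:00", "http://shop.example/cart"),
    ("abc123", "2006-07-14T23:05:00", "http://blog.example/post.html"),
]
timelines = build_timelines(hits)
print(timelines["abc123"])
```

Nothing here is clever, and that is the point: once every request carries the same identifier, reconstructing what someone read, and in what order, is a grouping and a sort.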

If Google knows your real identity also ...

Now ... they know you by your cookie, but do they really know who you are? Well, if you choose to use any number of Google services - Gmail, AdWords, AdSense, etc. - then the answer is yes! In most cases, you join these services and begin to disclose personal information that just might be a solid connection to the real you. And remember, each time you use these services that nice little Google cookie ensures that they know it's you. Closing the loop. Connecting the dots.

Lastly ... your friends? Well ... Google now knows via Gmail who you communicate with, and at what intervals and times. They now know the type of people that your friends and contacts hang out with. Google knows that YOU are the type of person that all of these people communicate with. From their e-mail address they might even draw the direct connection to yet another person about whom they have collected all of this data ... from their Google cookie. I haven't really spent too much time thinking about how much deeper all of this goes ... however it makes sense why Google wants all the storage and bandwidth they are building out. It's not about providing search to you ... it's about owning a perspective of you that no one else on the planet could recreate right now.

Google knows you like no one else. Google knows more about you and me than we know about ourselves. Google will use this to provide us what we really want ... right? Google will do no evil ... right? Google would never use this data to use us ... to manipulate our undistinguished behaviors ... right? The Internet is here, and some things appear to be inevitable ...

Google knows who you REALLY are.