Google Analytics is the most widely used analytics package in the world, but what Google stores in its cookies is less well known. I was researching cookie tracking while thinking of a way to implement a script that tracks conversion attribution for our affiliates. At first sight, Google gives a useful overview of its cookies in its documentation. What's missing, though, is what is actually stored in them.
Google sets four to six different cookies with cryptic names like ‘__utma’ and ‘__utmz’. Each cookie serves a different purpose. I will explain each purpose and the data the cookie holds.
__utma, identifies unique visitors
The first cookie (at least in Google’s naming scheme) is used to identify unique visitors. It is set to expire 2 years after each update. The fields of an example cookie, in order:
- 79104832: domain hash, unique for each domain (reused across all cookies)
- 870834247: unique identifier
- 1300982179: timestamp for the time you first visited the site
- 1302517735: timestamp for the previous visit
- 1302528749: timestamp for the start of current visit
- 59: number of sessions started
The combination of the unique ID and the timestamp of the first visit forms a unique identifier that Google uses to tell visitors apart. Timestamps are stored in Unix time format: the number of seconds elapsed since January 1st, 1970.
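To make the field layout concrete, here is a minimal Python sketch that splits a `__utma` value into the six fields listed above and converts the Unix timestamps to dates. It assumes the fields are dot-separated in the order given; the example string is assembled from the values in the list, and the names `Utma`, `parse_utma` and `visitor_key` are hypothetical helpers, not part of any Google library.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Utma:
    domain_hash: str
    visitor_id: str
    first_visit: datetime
    previous_visit: datetime
    current_visit: datetime
    session_count: int


def _from_unix(seconds: str) -> datetime:
    """Convert a Unix timestamp (seconds since 1970-01-01 UTC) to a datetime."""
    return datetime.fromtimestamp(int(seconds), tz=timezone.utc)


def parse_utma(value: str) -> Utma:
    """Split a __utma value into its six dot-separated fields (assumed layout)."""
    parts = value.split(".")
    if len(parts) != 6:
        raise ValueError(f"expected 6 dot-separated fields, got {len(parts)}")
    domain_hash, visitor_id, first, previous, current, sessions = parts
    return Utma(
        domain_hash=domain_hash,
        visitor_id=visitor_id,
        first_visit=_from_unix(first),
        previous_visit=_from_unix(previous),
        current_visit=_from_unix(current),
        session_count=int(sessions),
    )


def visitor_key(utma: Utma) -> str:
    """Unique ID plus first-visit timestamp, the pair that identifies a visitor."""
    return f"{utma.visitor_id}.{int(utma.first_visit.timestamp())}"


# Assumed cookie value, built from the field values listed above.
example = "79104832.870834247.1300982179.1302517735.1302528749.59"
utma = parse_utma(example)
print(utma.first_visit.isoformat())
print(visitor_key(utma))
```

Keeping the IDs as strings avoids accidentally doing arithmetic on them; only the timestamps and the session counter are treated as numbers.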
__utmb and __utmc, identify sessions
These cookies are used to determine sessions. The first one expires after 30 minutes of inactivity and the second one expires when the browser is closed. Together they identify unique sessions. When you close your browser but come back within 30 minutes, GA still records a new session, because the __utmc cookie was deleted when the browser closed. This way GA can record new sessions that start within the 30-minute limit.
The fields of an example __utmb cookie, in order:
- 79104832: domain hash
- 3: number of pageviews in current session
- 10: outgoing link counter. It starts at 10 on every site and counts down by one each time you click an outgoing link, until it reaches 0. It’s part of an outgoing link tracking system in ga.js that never appeared in the GA interface (thanks André Scholten for pointing this out in the comments)
- 1302530994: timestamp for the start of the current session
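The session rule and the `__utmb` layout described above can be sketched in a few lines of Python. This is a simplified reading of the behaviour, not GA's actual code: it assumes `__utmb` is dot-separated in the order listed, that `__utmc` holds only the domain hash, and that a session counts as new whenever either cookie is missing. `Utmb`, `parse_utmb` and `starts_new_session` are hypothetical names.

```python
from datetime import datetime, timezone
from typing import Mapping, NamedTuple


class Utmb(NamedTuple):
    domain_hash: str
    pageviews: int
    outbound_counter: int
    session_start: datetime


def parse_utmb(value: str) -> Utmb:
    """Split a __utmb value into its four dot-separated fields (assumed layout)."""
    parts = value.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 dot-separated fields, got {len(parts)}")
    domain_hash, pageviews, counter, started = parts
    return Utmb(
        domain_hash=domain_hash,
        pageviews=int(pageviews),
        outbound_counter=int(counter),
        session_start=datetime.fromtimestamp(int(started), tz=timezone.utc),
    )


def starts_new_session(cookies: Mapping[str, str]) -> bool:
    """True when either session cookie is gone.

    __utmb disappears after 30 minutes of inactivity, __utmc when the
    browser closes, so a missing cookie means the old session has ended.
    """
    return "__utmb" not in cookies or "__utmc" not in cookies


# Assumed cookie values, built from the field values listed above.
same_session = {"__utmb": "79104832.3.10.1302530994", "__utmc": "79104832"}
after_browser_restart = {"__utmb": "79104832.3.10.1302530994"}

print(parse_utmb(same_session["__utmb"]).pageviews)
print(starts_new_session(same_session))
print(starts_new_session(after_browser_restart))
```

The second dictionary models the case from the text: the browser was closed and reopened within 30 minutes, so `__utmb` is still there but `__utmc` is not.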
__utmz, identifies traffic sources
This cookie is where the meat is. It holds information about the way you entered the website. Stuff like keywords used in Google and campaigns like AdWords or affiliate marketing are all saved here. It expires after six months, so visits can be attributed to a certain campaign or source for up to six months. It is overwritten each time the user enters the site from a new source.
In the example described here, I entered Eduhub through Google’s organic search results with the keywords ‘human quality management eduhub’. Its fields, in order:
- 79104832: domain hash
- 1302099445: timestamp when cookie was set
- 45: session number, number of sessions you’ve had on the website
- 9: campaign number, number of different campaigns used to enter the site
- utmcsr: campaign source, what source was used to enter the website
- utmccn: campaign name. Names usually differ per campaign, so this variable makes each campaign measurable separately
- utmcmd: campaign medium. Organic in this case, other common mediums are referral, cpc and email
- utmctr: campaign terms. In this case the keywords I last used to enter Eduhub (human quality management eduhub)
Google also uses __utmv and __utmx, for custom variables and Website Optimizer, respectively.
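For attribution purposes, `__utmz` is the cookie you would actually want to read, so here is a minimal Python sketch of a parser for it. It assumes four dot-separated numeric fields followed by `key=value` pairs separated by `|`, with URL-encoded values. The example string is illustrative: the numbers and the search terms come from the list above, while `utmcsr=google` and `utmccn=(organic)` are assumed values for an organic Google visit. `Utmz` and `parse_utmz` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict
from urllib.parse import unquote


@dataclass
class Utmz:
    domain_hash: str
    set_at: datetime
    session_number: int
    campaign_number: int
    fields: Dict[str, str]  # utmcsr, utmccn, utmcmd, utmctr, ...


def parse_utmz(value: str) -> Utmz:
    """Parse a __utmz value (assumed layout: 4 dotted fields, then k=v|k=v)."""
    # Split at most 4 times: campaign values such as a referring
    # hostname may contain dots of their own.
    parts = value.split(".", 4)
    if len(parts) != 5:
        raise ValueError("expected 4 dotted fields followed by campaign data")
    domain_hash, set_at, sessions, campaigns, campaign_data = parts

    fields: Dict[str, str] = {}
    for pair in campaign_data.split("|"):
        key, _, raw = pair.partition("=")
        fields[key] = unquote(raw)  # decode %20 and similar escapes

    return Utmz(
        domain_hash=domain_hash,
        set_at=datetime.fromtimestamp(int(set_at), tz=timezone.utc),
        session_number=int(sessions),
        campaign_number=int(campaigns),
        fields=fields,
    )


# Illustrative value; utmcsr and utmccn are assumptions (see above).
example = (
    "79104832.1302099445.45.9."
    "utmcsr=google|utmccn=(organic)|utmcmd=organic|"
    "utmctr=human%20quality%20management%20eduhub"
)
utmz = parse_utmz(example)
print(utmz.fields["utmcmd"])
print(utmz.fields["utmctr"])
```

An affiliate attribution script could read `utmcsr` and `utmcmd` from the result at conversion time to decide which source gets the credit.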
What about privacy?