Medium’s Data Logs

Medium · Published in Medium Policy · Oct 3, 2016

At Medium, we maintain two types of logs: server logs and event logs.

Server logs: As at most internet companies, our servers automatically record each request made when someone visits our sites or uses our apps. We have two types of server logs: web proxy logs and application logs. These server logs may include your web request, IP address, browser type, browser language, the date and time of your request, and one or more cookies that may uniquely identify your browser. We will delete all server logs within 30 days.

Here is an anonymized example of a web proxy log entry for a user who views a post:

01.01.01.01 - - [10/Apr/2014:18:04:24 +0000]+0.018 "GET /p/xxxxxxxxxxx HTTP/1.1" 410 2467 "http://www.google.com" "Mozilla/5.0 (Windows; U; Win 9x 4.90; SG; rv:1.9.2.4) Gecko/20101104 Netscape/9.1.0285" "1010101010101:10x10x10x10x" "-" "medium.com"

The parts are as follows:

  • IP address (01.01.01.01)
  • Timestamp + request time ([10/Apr/2014:18:04:24 +0000]+0.018)
  • HTTP request, method + path + HTTP version ("GET /p/xxxxxxxxxxx HTTP/1.1")
  • HTTP status returned (410)
  • Response length in bytes (2467)
  • Referrer (http://www.google.com)
  • User agent (Mozilla/5.0 (Windows; U; Win 9x 4.90; SG; rv:1.9.2.4) Gecko/20101104 Netscape/9.1.0285)
  • Internal transaction ID (1010101010101:10x10x10x10x)
  • Medium client identifier ("-")
  • Host (medium.com)
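
For readers who want to work with entries in this shape, the field breakdown above can be sketched as a small parser. This is a minimal sketch, not Medium's own tooling: the regular expression, the function name `parse_proxy_log`, and the dictionary keys are assumptions inferred from the single example entry, and it assumes the raw log uses plain ASCII quotes and hyphens.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the one example entry; not an official format spec.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '                                      # IP address, two unused fields
    r'\[(?P<timestamp>[^\]]+)\]\+(?P<request_time>[\d.]+) '      # timestamp + request time
    r'"(?P<method>\S+) (?P<path>\S+) (?P<http_version>[^"]+)" '  # HTTP request
    r'(?P<status>\d{3}) (?P<length>\d+) '                        # status, response length
    r'"(?P<referrer>[^"]*)" '
    r'"(?P<user_agent>[^"]*)" '
    r'"(?P<transaction_id>[^"]*)" '
    r'"(?P<client_id>[^"]*)" '
    r'"(?P<host>[^"]*)"'
)


def parse_proxy_log(line: str) -> dict:
    """Split one proxy log line into named fields (hypothetical helper)."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError("line does not match the assumed proxy log layout")
    entry = match.groupdict()
    # Convert the fields that are clearly numeric or temporal.
    entry["timestamp"] = datetime.strptime(entry["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    entry["request_time"] = float(entry["request_time"])
    entry["status"] = int(entry["status"])
    entry["length"] = int(entry["length"])
    return entry


SAMPLE = (
    '01.01.01.01 - - [10/Apr/2014:18:04:24 +0000]+0.018 '
    '"GET /p/xxxxxxxxxxx HTTP/1.1" 410 2467 "http://www.google.com" '
    '"Mozilla/5.0 (Windows; U; Win 9x 4.90; SG; rv:1.9.2.4) Gecko/20101104 Netscape/9.1.0285" '
    '"1010101010101:10x10x10x10x" "-" "medium.com"'
)

entry = parse_proxy_log(SAMPLE)
print(entry["ip"], entry["status"], entry["path"], entry["host"])
```

The parser raises an error on lines that do not fit the assumed layout rather than guessing, since only one example of the format is available here.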

Event logs: Our event logs record user actions on the site, such as clicking through stories or scrolling. Event logs do not contain IP addresses, user names, user addresses, or user email addresses. They do contain user IDs generated by Medium, as well as descriptions of actions users take on the site. We may keep event logs indefinitely.

Here’s an example of an anonymized event log entry for a user who views a post:

{"tags": {"usergroup": "1"}, "isAuthenticated": false, "userId": "lo_10101010101", "id": "xx0101010101", "type": "emit", "client": "web", "createdAt": 1397152749513, "reportedAt": 1397152747909, "name": "post.xoxoxoxoxo", "value": 1, "data": {"location": "https://medium.com/matter/22979c8ec9d6", "referrer": "http://tech.slashdot.org/submission/3475293/are-the-deaf-being-silenced?sdsrc=rel", "userId": "lo_10101010101", "collectionSlug": "matter", "postId": "22979c8ec9d6"}}

The parts are as follows:

  • tags: arbitrary tags about the user
      • usergroup: arbitrary grouping for users
  • isAuthenticated: whether the user was logged in
  • userId: the user who performed the event
  • id: unique event ID, internal-only
  • type: used internally to handle processing of the event
  • client: device type
  • createdAt: timestamp when the event was processed
  • reportedAt: timestamp when the event was reported by the client
  • name: identifier for the type of event
  • data: arbitrary metadata, different for each event type
      • location: URL where the event happened
      • referrer: HTTP referrer
      • userId: the user who viewed the post
      • collectionSlug: the collection the post was in when it was viewed
      • postId: the post that was viewed
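
To illustrate how these fields fit together, here is a minimal sketch that reads an event log entry and pulls out a few of them. The helper name `summarize_event` and the output keys are hypothetical, and the code assumes that `createdAt` and `reportedAt` are milliseconds since the Unix epoch, which the example values suggest but the text above does not state.

```python
import json
from datetime import datetime, timezone

# The example entry from above, with plain ASCII quotes so it parses as JSON.
SAMPLE_EVENT = (
    '{"tags": {"usergroup": "1"}, "isAuthenticated": false, '
    '"userId": "lo_10101010101", "id": "xx0101010101", "type": "emit", '
    '"client": "web", "createdAt": 1397152749513, "reportedAt": 1397152747909, '
    '"name": "post.xoxoxoxoxo", "value": 1, '
    '"data": {"location": "https://medium.com/matter/22979c8ec9d6", '
    '"referrer": "http://tech.slashdot.org/submission/3475293/are-the-deaf-being-silenced?sdsrc=rel", '
    '"userId": "lo_10101010101", "collectionSlug": "matter", "postId": "22979c8ec9d6"}}'
)


def summarize_event(raw: str) -> dict:
    """Extract a few fields from one event log entry (hypothetical helper)."""
    event = json.loads(raw)
    data = event.get("data", {})  # per-event metadata; keys vary by event type
    return {
        "name": event["name"],
        "user_id": event["userId"],
        "authenticated": event["isAuthenticated"],
        "client": event["client"],
        "post_id": data.get("postId"),
        "collection": data.get("collectionSlug"),
        # Assumption: timestamps are milliseconds since the Unix epoch.
        "created_at": datetime.fromtimestamp(event["createdAt"] / 1000, tz=timezone.utc),
        # Gap between the client report and server-side processing, in ms.
        "lag_ms": event["createdAt"] - event["reportedAt"],
    }


summary = summarize_event(SAMPLE_EVENT)
print(summary["name"], summary["post_id"], summary["created_at"].date(), summary["lag_ms"])
```

Note that nothing in the entry identifies the reader by IP address, name, or email address; the only user identifier is the Medium-generated `userId`.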