How Does Google Read Your Gmail?

If someone asked you for the key to your home so that he (or she) could come into your house and inventory what you have purchased, in exchange for a free service, would you accept? That is basically what Gmail is asking for...

How the hell does Google know about my purchase on Amazon?

On a recent sunny weekend morning, I logged into my Google Account to change some of my account settings. When I was done, I had an "I feel lucky" moment and casually clicked on "Payments and subscriptions" on the left side of the Google Account dashboard. One of my Amazon purchases from a few days earlier showed up under my "Purchases" activity. My first reaction was "WTF, how the hell does Google know about my purchase on Amazon?" Google was not involved in the transaction in any way, except that the order confirmation email was sent to my Gmail inbox. But isn't the email transport protocol encrypted and secure? It may be, but transport encryption only protects a message on its way to the inbox, so the only way Google could know about this transaction is by reading (technically, parsing) my Gmail. I drilled down into the details and clicked on the little 'i' next to the purchase information. A popup came up, bluntly stating "This purchase was found in your Gmail".

Giving Google the key to my home

My second thought was "Oh my god, did I grant Google the right to read my Gmail?" When I signed up for Gmail some 15 years ago, in 2004, I was aware that it was a "free, advertising-supported webmail service". That was how it was marketed and what I signed up for: I had to accept those annoying advertisements beside my webmail screen. I honestly don't recall there being a clause about letting Google read my mail. Or has Google quietly changed the Privacy Policy and added it at some point during the past 15 years? (They have certainly dropped the company motto "Don't be evil".)

As far as I can tell, Google isn't interested in your private emails, but it is particularly interested in four areas of your purchase habits and activities:

  1. Payments and transactions that you made through Google Search, Maps and Google Assistant;
  2. Any purchases that it can find in your Gmail or Google Assistant records;
  3. Any subscription services that you subscribe to;
  4. Any reservations you make, whether for a hotel, flight, restaurant, event, etc.

Of course, if you are using Google Pay, I'm sure every single record is logged. By default, Google keeps this data for 18 months. You have two other options: let Google keep only 3 months of data, or keep the data until you manually delete it. There is no option to stop Gmail from collecting your purchase data. You have to delete the emails related to the purchase, or not use Gmail for receiving purchase confirmations at all!

A quick check of Google's privacy policy page shows that, by signing up for a Google service, you basically give your "consent" to let Google track your activities, including your activities on any third-party sites or apps that use Google services (in this case, Gmail).

Lately I've been thinking about this a lot, and I wish I had known about Gmail's behaviour earlier. The best analogy I can think of for Google's way of asking for "consent" is this: Google is asking not only to peek through my windows, but for the key to my home, so that it can come in at any time, inventory my refrigerator and storage, or search for any purchase receipts that I casually tucked into my wallet, all in exchange for a free Gmail service. If Google asked for this explicitly, I think most people, not just me, would find that it is too much and crosses a boundary, and would not accept it.

How does Google store the data about my purchases?

One thing that struck me is that the Amazon purchase isn't the only purchase email I have in my Gmail folders. A few of my namecheap.com purchase/subscription confirmations did not show up on Google's radar, but a Lazada (an Alibaba-invested e-commerce player in South East Asia) purchase from Jan 2018 is shown in my Purchases activity, although the data was only partially scraped by Google (more on this later).

I decided to find out what else Google knows about me by downloading all the data that Google keeps about me (go to https://myaccount.google.com/data-and-personalization, scroll down and click on "Download your data"). I selected all the Google products (51 of them) and proceeded with the download. You can choose to split the archive into multiple download packages of between 1GB and 50GB each.

Interestingly, Google calls your download request a Takeout, as if it knows that if you are doing this, you are ready to take your data out and leave Google. Anyway, I consider myself a light user of Google products. The ones I mainly use are Google Maps, YouTube, Gmail and Google Analytics. It took Google about 15 minutes to gather all the information and send me the links for downloading the Takeout; mine is about 7.8GB altogether.

Is Amazon sharing data with Google?

The two purchases found in my Gmail are stored in JSON format under a folder called "Purchases _ Reservations". The one for the Amazon purchase is extremely detailed.

Amazon order as a JSON file in Google's records


{
  "merchantOrderId": "D01-xxxxxxx-xxxxxxx",
  "creationTime": {
    "usecSinceEpochUtc": "1558445164000000",
    "granularity": "MICROSECOND"
  },
  "transactionMerchant": {
    "name": "Amazon.com"
  },
  "lineItem": [{
    "purchase": {
      "status": "ACCEPTED",
      "unitPrice": {
        "amountMicros": "2990000",
        "currencyCode": {
          "code": "USD"
        },
        "displayString": "$2.99"
      },
      "returnsInfo": {
        "isReturnable": true,
        "daysToReturn": 30
      },
      "landingPageUrl": {
        "link": "https://www.amazon.com/gp/r.html?C\u003dO5ID2HJ90BI3\...(url shorted for posting)...u003dpe_385040_118058080_TE_M1DP"
      },
      "productInfo": {
        "name": "RUST: Programming Language. (name shorted for posting)",
        "description": "Kindle Edition"
      }
    },
    "name": "RUST: Programming Language. (name shorted for posting)"
  }],
  "priceline": [{
    "type": "SUBTOTAL",
    "amount": {
      "amountMicros": "2990000",
      "currencyCode": {
        "code": "USD"
      }
    }
  }, {
    "type": "TAX",
    "amount": {
      "amountMicros": "0",
      "currencyCode": {
        "code": "USD"
      },
      "displayString": "$0.00"
    }
  }]
}
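
The raw numbers in this record are not very readable. As a rough sketch, assuming that "amountMicros" means millionths of the currency unit and that "usecSinceEpochUtc" is microseconds since the Unix epoch (both inferred from the field names and the "$2.99" display string, not from any Google documentation), a few lines of Python can turn a record into something human-friendly. The function names and the trimmed sample record are my own, for illustration only.

```python
import json
from datetime import datetime, timedelta, timezone
from decimal import Decimal

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Trimmed copy of the Amazon record above; only the fields decoded here are kept.
RECORD = json.loads("""
{
  "merchantOrderId": "D01-xxxxxxx-xxxxxxx",
  "creationTime": {"usecSinceEpochUtc": "1558445164000000"},
  "transactionMerchant": {"name": "Amazon.com"},
  "lineItem": [{
    "purchase": {
      "status": "ACCEPTED",
      "unitPrice": {"amountMicros": "2990000", "currencyCode": {"code": "USD"}}
    },
    "name": "RUST: Programming Language. (name shorted for posting)"
  }]
}
""")


def micros_to_amount(micros: str) -> Decimal:
    """Assumption: amountMicros is the price in millionths of the currency unit."""
    return Decimal(micros) / Decimal(1_000_000)


def usec_to_datetime(usec: str) -> datetime:
    """Assumption: usecSinceEpochUtc is microseconds since 1970-01-01 UTC."""
    return EPOCH + timedelta(microseconds=int(usec))


def summarize_purchase(record: dict) -> dict:
    """Pull merchant, order time and line items out of one purchase record."""
    items = []
    for line in record.get("lineItem", []):
        purchase = line.get("purchase", {})
        price = purchase.get("unitPrice")  # absent in partially parsed records
        items.append({
            "name": line.get("name"),
            "status": purchase.get("status"),
            "price": micros_to_amount(price["amountMicros"]) if price else None,
            "currency": price["currencyCode"]["code"] if price else None,
        })
    return {
        "merchant": record["transactionMerchant"]["name"],
        "ordered_at": usec_to_datetime(record["creationTime"]["usecSinceEpochUtc"]),
        "items": items,
    }


if __name__ == "__main__":
    print(summarize_purchase(RECORD))
```

The missing-field handling is deliberate: a record with an empty "purchase" block simply yields an item with no price instead of raising an error.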

One odd thing I noticed about this JSON data is that it contains something I couldn't find in my Gmail: the returnsInfo block. It simply does not exist in the email body. I took a further look at the email source by logging in to the https://www.gmail.com web page, opening the Amazon purchase email, clicking on the "More" dropdown button on the right side next to the email header, and clicking on "Show original". Again, nowhere in the HTML source of the email body could I find any information about the item being returnable. How on earth could Google get such information if it simply doesn't exist in the Gmail message? No one could have the returnsInfo data except Amazon. Is Amazon sharing my purchase data with Google?

My other purchase-related emails

My Lazada purchase in Jan 2018 (about 18 months ago) was not sent to my Gmail address. I somehow forwarded the receipt to my Gmail inbox, and that is why it is found in my Gmail archive. Based on the Google Takeout data, it looks like an incomplete or unsuccessful parse of the order.

Lazada order as a JSON file in Google's records


{
  "merchantOrderId": "303913333",
  "creationTime": {
    "usecSinceEpochUtc": "1515436099000000",
    "granularity": "MICROSECOND"
  },
  "transactionMerchant": {
    "name": "lazada.sg"
  },
  "lineItem": [{
    "purchase": {
      "status": "SHIPPED",
      "fulfillment": {
      }
    }
  }]
}

Interestingly, when I viewed the source of the Lazada email, I found a JSON document embedded in the HTML <head> section, wrapped in a <script type="application/json"> tag. Based on the content, it appears to come from PayPal and describes my payment transaction. It has rich information about me: my name, billing address, telephone number, the total amount (no breakdown) of the payment transaction, the merchant name (i.e. Lazada), the Lazada order page URL, etc. Somehow Google did not capture this "above the fold" data, which seems to suggest that Google is actually parsing the email body using some sort of scraping tool to get the information out. But it still does not answer the question of where Google got the returnable information for the Amazon order...
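
To show how little effort it takes to read that kind of embedded block, here is a minimal sketch using only the Python standard library. It is not a claim about how Google's parser works; the class name, the function name and the sample email (including its made-up "merchant", "total" and "currency" fields) are hypothetical and exist only for illustration.

```python
import json
from html.parser import HTMLParser


class EmbeddedJsonExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/json"> tags."""

    def __init__(self, script_types=("application/json",)):
        super().__init__()
        self.script_types = script_types
        self.documents = []   # successfully parsed JSON blocks
        self._buffer = None   # text chunks while inside a matching <script>

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") in self.script_types:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            text = "".join(self._buffer).strip()
            self._buffer = None
            try:
                self.documents.append(json.loads(text))
            except json.JSONDecodeError:
                pass  # skip blocks that are not valid JSON


def extract_embedded_json(html: str) -> list:
    """Return every JSON document embedded in matching <script> tags."""
    parser = EmbeddedJsonExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Hypothetical email source; the field names are invented for this example.
SAMPLE_EMAIL_HTML = """
<html><head>
<script type="application/json">
{"merchant": "Lazada", "total": "12.34", "currency": "SGD"}
</script>
</head><body><p>Thank you for your order.</p></body></html>
"""

if __name__ == "__main__":
    print(extract_embedded_json(SAMPLE_EMAIL_HTML))
```

A parser that only looks at the visible body text would never see this block, which is consistent with the partial Lazada record above.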

Between my Lazada order in Jan 2018 and my Amazon order in May 2019, I received several namecheap.com orders in my Gmail folder, but Google somehow did not parse them. One thing I noticed is that the namecheap.com emails are Base64-encoded, but it should be very easy to decode them to get the HTML source. Maybe namecheap.com is too small, or its products are too niche, for Google to pay attention to it...
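
To back up the "very easy to decode" remark, here is a small sketch with Python's standard email package. The message below is made up (addresses, subject and order number are placeholders, not the real namecheap.com email), and decoded_html is just an illustrative helper name.

```python
import base64
from email import message_from_string, policy

# Build a made-up message whose HTML body is Base64-encoded,
# similar in shape to what "Show original" displays in Gmail.
html_body = "<html><body><p>Order #12345 confirmed</p></body></html>"
encoded = base64.b64encode(html_body.encode("utf-8")).decode("ascii")

RAW_MESSAGE = (
    "From: orders@example.com\n"
    "To: me@example.com\n"
    "Subject: Order confirmation\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/html; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    f"{encoded}\n"
)


def decoded_html(raw: str):
    """Return the HTML body as text, or None if the message has no HTML part."""
    msg = message_from_string(raw, policy=policy.default)
    part = msg.get_body(preferencelist=("html",))
    # get_content() undoes the transfer encoding and the charset in one step.
    return part.get_content() if part is not None else None


if __name__ == "__main__":
    print(decoded_html(RAW_MESSAGE))
```

So the encoding alone is unlikely to be what stops a parser; it is a standard transfer encoding that any mail library undoes automatically.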

What else Google knows about me

Ever since I discovered, more than a year ago, that the Google Chrome software updater behaves like malware, I have stopped using Chrome. I only keep a copy on my computer for development testing, without allowing Google to update it automatically. I also switched from Google Search to DuckDuckGo at about the same time, so Google Takeout has no data about my search and browser history (good!). I don't have an Android phone, so there are no records for Google Play, Calendar, Tasks, etc. either.

My biggest usage of Google services is Gmail, Google Maps and YouTube, plus Google Analytics (for this web site). Google does have my whereabouts captured from my use of Google Maps, dating back about 8 months. Every time you launch Google Maps, whether on your desktop or mobile phone, Google captures a snapshot of your location and stores the data.

Location data that Google captured when you use Google Maps


{
    "timestampMs" : "151xxxxxxx061",
    "latitudeE7" : 3xxxxxxxx,
    "longitudeE7" : 1xxxxxxxxx,
    "accuracy" : 5,
    "velocity" : 0,
    "heading" : 210,
    "altitude" : 77,
    "verticalAccuracy" : 8
}
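
The field names hint at the encoding. Assuming that the "E7" suffix means degrees multiplied by 10^7 and that "timestampMs" is milliseconds since the Unix epoch, a record can be decoded as in the sketch below. The sample values are made up, since the real ones are redacted above, and decode_location is an illustrative name.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Made-up values standing in for the redacted record above.
SAMPLE_LOCATION = {
    "timestampMs": "1546300800000",
    "latitudeE7": 515007000,
    "longitudeE7": -1246000,
    "accuracy": 5,
}


def decode_location(record: dict) -> dict:
    """Convert one location record to a UTC datetime and decimal degrees."""
    return {
        # Assumption: timestampMs is milliseconds since 1970-01-01 UTC.
        "time": EPOCH + timedelta(milliseconds=int(record["timestampMs"])),
        # Assumption: the E7 fields are degrees scaled by 10**7.
        "latitude": record["latitudeE7"] / 1e7,
        "longitude": record["longitudeE7"] / 1e7,
        "accuracy": record.get("accuracy"),  # passed through unchanged
    }


if __name__ == "__main__":
    print(decode_location(SAMPLE_LOCATION))
```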

I'm also a big YouTube user, as I have a YouTube channel with 30+ cycling videos that I created, so a very big portion of the 7.8GB Takeout download is due to those videos. YouTube is by far the most pervasive data collector and tracker among all Google services: it holds a record of every video I watched and every search I made on YouTube going back 5 years! Of course, it also has every comment I have made on the YouTube platform since my first one, and every YouTube channel that I subscribe to.

I suggest that you go and download your Takeout, see what Google knows about you, and adjust your Google privacy settings accordingly.

Personally, I want my home key back!
