Google recently announced on its blog that Ethereum blockchain data is now available for exploration in BigQuery, Google Cloud’s petabyte-scale data warehousing solution.

BigQuery makes it possible to explore the entire history of the Ethereum blockchain. The data is extracted from the chain and loaded into BigQuery by the Ethereum ETL project, whose source code is available on GitHub. Google says it welcomes more contributors and more blockchains.

Why make Ethereum blockchain data available on Google Cloud?

The purpose of making the Ethereum blockchain data available on Google Cloud is to make all of the data stored on the Ethereum blockchain easily accessible.

Currently, Ethereum’s software has an API “for a subset of commonly used random-access functions (for example: checking transaction status, looking up wallet-transaction associations, and checking wallet balances),” but “API endpoints don’t exist for easy access to all of the data stored on-chain.”

The API endpoints also do not make it possible to view “blockchain data in aggregate,” while BigQuery’s OLAP features enable such analysis. The blog then shows a chart displaying “the total Ether transferred and average transaction cost, aggregated by day.”
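To make the idea of aggregate analysis concrete, here is a minimal Python sketch of the kind of per-day roll-up that chart describes. It is not Google's query: the record fields (`timestamp`, `value_wei`, `gas_used`, `gas_price_wei`) and the function name are assumptions chosen for illustration, and the sample rows are made up.

```python
from collections import defaultdict
from datetime import datetime, timezone

WEI_PER_ETHER = 10**18  # 1 Ether = 10^18 wei


def daily_totals(transactions):
    """Return {iso_date: (total_ether_transferred, average_cost_in_ether)}.

    Each transaction is a dict with hypothetical keys:
    timestamp (Unix seconds), value_wei, gas_used, gas_price_wei.
    """
    by_day = defaultdict(lambda: {"value": 0, "cost": 0, "count": 0})
    for tx in transactions:
        day = datetime.fromtimestamp(tx["timestamp"], tz=timezone.utc).date().isoformat()
        bucket = by_day[day]
        bucket["value"] += tx["value_wei"]
        # Cost of a transaction = gas consumed * price paid per unit of gas.
        bucket["cost"] += tx["gas_used"] * tx["gas_price_wei"]
        bucket["count"] += 1
    return {
        day: (b["value"] / WEI_PER_ETHER, b["cost"] / b["count"] / WEI_PER_ETHER)
        for day, b in sorted(by_day.items())
    }


# Made-up sample rows, only to show the shape of the input.
sample = [
    {"timestamp": 1534291200, "value_wei": 2 * WEI_PER_ETHER, "gas_used": 21000, "gas_price_wei": 10**9},
    {"timestamp": 1534291260, "value_wei": 1 * WEI_PER_ETHER, "gas_used": 21000, "gas_price_wei": 3 * 10**9},
]
for day, (total_ether, avg_cost) in daily_totals(sample).items():
    print(day, total_ether, avg_cost)
```

In BigQuery the same grouping would be expressed in SQL over the full dataset; the sketch only shows the logic on a handful of in-memory rows.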


Google built a software system on Google Cloud that does the following three things:

  1. Synchronizes the Ethereum blockchain to computers running Parity in Google Cloud.
  2. Performs a daily extraction of data from the Ethereum blockchain ledger, including the results of smart contract transactions, such as token transfers.
  3. De-normalizes and stores date-partitioned data to BigQuery for easy and cost-effective exploration.
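As a rough illustration of the third step, the sketch below joins each transaction with its block and groups the resulting rows by UTC date. It is only an assumption about what de-normalizing and date-partitioning could look like; the field names (`number`, `hash`, `timestamp`, `block_number`) and the function are hypothetical and are not taken from the Ethereum ETL code.

```python
from collections import defaultdict
from datetime import datetime, timezone


def denormalize_by_date(blocks, transactions):
    """Copy block fields onto each transaction and group rows by UTC date.

    blocks: dicts with hypothetical keys number, hash, timestamp (Unix seconds).
    transactions: dicts with a hypothetical block_number key plus any other fields.
    Returns {iso_date: [row, ...]}, one list per date partition.
    """
    blocks_by_number = {block["number"]: block for block in blocks}
    partitions = defaultdict(list)
    for tx in transactions:
        block = blocks_by_number[tx["block_number"]]
        # De-normalize: repeat the block's fields on every transaction row,
        # so later queries do not need a join.
        row = dict(tx, block_timestamp=block["timestamp"], block_hash=block["hash"])
        day = datetime.fromtimestamp(block["timestamp"], tz=timezone.utc).date().isoformat()
        partitions[day].append(row)
    return dict(partitions)
```

Partitioning by date means a query restricted to a date range only has to scan the matching partitions, which is what keeps exploration cost-effective.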

Interesting queries and analyses

The blog post then goes on to highlight three interesting findings from Google’s analysis.

Analysis 1: Popular Smart Contracts Event Logs

The most popular use case of Ethereum is smart contracts, which enable the creation of new tokens.

The blog demonstrates “querying the dataset’s transactions and contracts tables to find the most popular smart contracts, as measured by transaction count.”


The most popular ERC-721 smart contract by transaction count is 0x06012c8cf97bead5deae237070f9587f8e7a266d, the main smart contract for the CryptoKitties game.
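The ranking behind that result can be sketched in a few lines of Python. This is an illustration of counting transactions per contract address, not the blog's SQL; the `to_address` field and the function name are assumptions, and the addresses in the usage example are placeholders.

```python
from collections import Counter


def top_contracts(transactions, contract_addresses, limit=10):
    """Rank contracts by how many transactions were sent to them.

    transactions: dicts with a hypothetical to_address key (None for
    contract-creation transactions, which have no recipient).
    contract_addresses: iterable of addresses known to be contracts.
    """
    known = {address.lower() for address in contract_addresses}
    counts = Counter(
        tx["to_address"].lower()
        for tx in transactions
        if tx.get("to_address") and tx["to_address"].lower() in known
    )
    return counts.most_common(limit)


# Placeholder addresses, only to show usage.
txs = [{"to_address": "0xAAA"}, {"to_address": "0xaaa"}, {"to_address": "0xbbb"}, {"to_address": None}]
print(top_contracts(txs, ["0xaaa", "0xbbb"]))
```

Restricting the count to addresses in the contracts table is what separates smart contracts from ordinary wallets, which also receive transactions.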

Analysis 2: Transaction Volumes and Transaction Networks

Thousands of tokens are in use on the Ethereum blockchain, and the distribution patterns vary from token to token. By focusing on each token’s transaction activity, Google is able to measure which tokens are the most popular, either in aggregate or over a given time frame.

The blog charts the daily number of transactions for OmiseGO ($OMG), the 5th most popular Ethereum token.

Note that on September 13, 2017, there was a large increase in the number of $OMG receivers but no increase in the number of senders. This corresponds to the beginning of the OmiseGO Token Airdrop.
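A pattern like this airdrop shows up when distinct senders and distinct receivers are counted separately for each day. The following Python sketch does that for a list of token transfers; the field names (`date`, `from_address`, `to_address`) and the sample data are assumptions for illustration only.

```python
from collections import defaultdict


def daily_unique_parties(transfers):
    """Return {date: (unique_senders, unique_receivers)} for token transfers.

    transfers: dicts with hypothetical keys date (ISO string),
    from_address and to_address.
    """
    senders = defaultdict(set)
    receivers = defaultdict(set)
    for transfer in transfers:
        senders[transfer["date"]].add(transfer["from_address"])
        receivers[transfer["date"]].add(transfer["to_address"])
    return {day: (len(senders[day]), len(receivers[day])) for day in sorted(senders)}


# Made-up airdrop-like day: one sender distributing to many receivers.
airdrop = [
    {"date": "2017-09-13", "from_address": "0xsender", "to_address": f"0xreceiver{i}"}
    for i in range(5)
]
print(daily_unique_parties(airdrop))
```

A day with many receivers but few senders is the signature of one address distributing tokens widely, which matches the airdrop described above.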

Analysis 3: Analysis of Smart Contract Functionality

Many smart contracts on the Ethereum blockchain are ERC20 compliant, meaning they implement the set of functions defined by the ERC20 standard. Most of these functions relate to token transfer; the full description can be found in the ERC20 Token Standard specification document.

Smart contracts can implement many other functions as well, and Google has analysed these. Because many smart contracts have open-source code, Google was able to learn what specific contracts do from the functions they implement.

Coming back to CryptoKitties, discussed in Analysis 1 above, the major element of gameplay is animal husbandry, and the mixing of genes in a breeding event is implemented in the CryptoKitties GeneScience smart contract, 0xf97e0a5b616dffc913e72455fde9ea8bbe946a2b. To find other games implementing gameplay mechanics similar to the GeneScience contract, the blog measures contract similarity with a query that uses a JavaScript UDF implementation of the Jaccard similarity coefficient.

“These results reveal that several earlier versions of the GeneScience contract are most similar to the current version of the smart contract at address 0xf97e0a5b616dffc913e72455fde9ea8bbe946a2b. But there are also some others (e.g. CryptoPuppies at 0xb64e6bef349a0d3e8571ac80b5ec522b417faeb6), that appear to be highly similar contracts, as measured by method signatures.”
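The Jaccard similarity coefficient of two sets is the size of their intersection divided by the size of their union, so it ranges from 0 (nothing shared) to 1 (identical sets). The blog implements it as a JavaScript UDF inside BigQuery; the version below is a plain Python sketch of the same measure, applied to sets of method signatures. The function names, the contract labels and the signatures in the example are illustrative assumptions, not the blog's data.

```python
def jaccard(a, b):
    """Jaccard similarity: |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Both sets empty: similarity is undefined; 0.0 is the convention assumed here.
        return 0.0
    return len(a & b) / len(union)


def most_similar(target_signatures, candidates, limit=5):
    """Rank candidate contracts by similarity of their method signatures.

    candidates: {contract_label: iterable of method signatures}.
    Returns a list of (score, contract_label), highest score first.
    """
    scored = [(jaccard(target_signatures, sigs), label) for label, sigs in candidates.items()]
    return sorted(scored, reverse=True)[:limit]


# Illustrative signatures only.
target = {"isGeneScience()", "mixGenes(uint256,uint256,uint256)"}
candidates = {
    "earlier_version": {"isGeneScience()", "mixGenes(uint256,uint256,uint256)"},
    "token_contract": {"isGeneScience()", "transfer(address,uint256)"},
}
print(most_similar(target, candidates))
```

Because the score depends only on which method signatures appear, two contracts with different internal logic but the same interface will still score as highly similar, which is why the quoted results are qualified with "as measured by method signatures."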

The full article can be read on the Google Cloud blog.
