Data Source

Code

The data layer, known as the DaaS, serves as the source for all time series data for the price simulator. This includes historical data and volume values.

The code provides an interface for querying historical price data for crypto assets from a Google BigQuery database. The get_prices_for_assets function is the main entrypoint: it accepts a list of assets, a block range, and a granularity, then retrieves price data for each asset in the list. The code uses Web3 to interact with the Ethereum blockchain, Pandas for data manipulation, and Google's BigQuery client for querying.

This document covers the API endpoints available in the Data-as-a-Service (DaaS) controller. These endpoints allow you to retrieve historical price data, available assets, and date/time information from the DaaS.

DaaS Endpoints

Get Historical Prices

POST /daas and POST /daas/prices

Fetches historical price data for specified assets within a given block range or lookback period.

Input

  • assets (required): An array of asset identifiers for which to retrieve prices.
  • block_lookback (optional): The number of blocks to look back from the end block. Only used if start_block is undefined. Defaults to a pre-defined value (the get_prices_for_assets function uses 100000, defined in the DEFAULT_BLOCK_LOOKBACK constant).
  • start_block (optional): The starting block number for the price data retrieval. If unspecified, the lookback period is used.
  • end_block (optional): The ending block number for the price data retrieval. If unspecified, the latest block is used.
  • data_feed (optional): The data feed source to use. Defaults to biglake.
  • granularity (optional): The granularity of the price data in seconds. Defaults to 0, indicating block-level data.
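As an illustration of how these inputs fit together, here is a minimal sketch of a helper that assembles a request body for POST /daas/prices. The helper name build_prices_payload is hypothetical, and the sketch assumes that each asset is sent in the same {'symbol': ...} form used by get_prices_for_assets; check the actual controller for the exact shape it expects.

```python
import json
from typing import Any, Dict, List, Optional


def build_prices_payload(
    assets: List[Dict[str, str]],
    start_block: Optional[int] = None,
    end_block: Optional[int] = None,
    block_lookback: Optional[int] = None,
    data_feed: str = "biglake",
    granularity: int = 0,
) -> Dict[str, Any]:
    """Assemble a JSON-serialisable body for POST /daas/prices (hypothetical helper)."""
    # assets is the only required input; rejecting an empty list is a choice
    # made by this sketch, not documented server behaviour.
    if not assets:
        raise ValueError("assets is required and must not be empty")

    payload: Dict[str, Any] = {
        "assets": assets,
        "data_feed": data_feed,
        "granularity": granularity,
    }
    # Leave optional block fields out when unset so the service defaults apply.
    for name, value in (
        ("start_block", start_block),
        ("end_block", end_block),
        ("block_lookback", block_lookback),
    ):
        if value is not None:
            payload[name] = value
    return payload


body = build_prices_payload(
    [{"symbol": "ETHUSDT"}], start_block=10000000, end_block=10010000
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be passed as the JSON body of the request by whichever HTTP client you use.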

Output

A JSON object where each key is a string representation of the asset identifier, and the value is another object containing arrays of data corresponding to each column in the DataFrame.
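To make the column-oriented shape concrete, the sketch below turns a response of this form into per-asset lists of rows using only the standard library. The sample body is made up for illustration, and the column names Date and Close are an assumption taken from the DataFrame returned by get_prices_for_assets.

```python
import json
from typing import Dict, List

# Illustrative response body; the values are invented for this example.
sample_response = json.dumps(
    {
        "ETHUSDT": {
            "Date": ["2020-05-04T13:22:00", "2020-05-04T13:22:15"],
            "Close": [206.1, 206.3],
        }
    }
)


def columns_to_rows(columns: Dict[str, list]) -> List[dict]:
    """Convert {column: [values...]} into a list of {column: value} rows."""
    names = list(columns)
    # Every column array must have the same length to line up into rows.
    if len({len(columns[name]) for name in names}) > 1:
        raise ValueError("columns have different lengths")
    return [
        dict(zip(names, values))
        for values in zip(*(columns[name] for name in names))
    ]


prices = {
    asset: columns_to_rows(columns)
    for asset, columns in json.loads(sample_response).items()
}
print(prices["ETHUSDT"][0])
```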

Get Assets

GET /daas/assets

Retrieves a list of all available assets from the specified data feed.

Input

  • data_feed (optional): The data feed source from which to retrieve assets. Defaults to biglake.

Output

A JSON object with a single key assets, containing an array of available assets in the data source.

Get Min/Max Datetimes

GET /daas/datetimes

Fetches the minimum and maximum datetimes for the specified assets from the given data feed.

Input

  • assets (optional): An array of asset identifiers for which to retrieve datetime information. Defaults to an empty array if not specified, indicating that all assets should be used.
  • data_feed (optional): The data feed source from which to retrieve datetime information. Defaults to biglake.

Output

A JSON array where each element is a JSON object representing a record from the dataframe, with datetime fields formatted in ISO 8601 format. Clients can use pd.read_json to convert this JSON output to a usable Pandas DataFrame.
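For clients that do not use Pandas, the same records can be parsed with the standard library. The sketch below is a minimal example; the field names asset, min_datetime, and max_datetime are assumptions chosen for illustration, as is the sample record.

```python
import json
from datetime import datetime
from typing import List

# Illustrative response body; field names and values are assumptions.
sample = json.dumps(
    [
        {
            "asset": "ETHUSDT",
            "min_datetime": "2019-01-01T00:00:00",
            "max_datetime": "2023-06-30T23:59:59",
        }
    ]
)


def parse_datetime_records(
    text: str, datetime_fields=("min_datetime", "max_datetime")
) -> List[dict]:
    """Parse the JSON array and convert ISO 8601 strings to datetime objects."""
    records = json.loads(text)
    for record in records:
        for field in datetime_fields:
            if record.get(field) is not None:
                record[field] = datetime.fromisoformat(record[field])
    return records


records = parse_datetime_records(sample)
coverage = records[0]["max_datetime"] - records[0]["min_datetime"]
print(records[0]["asset"], "covers", coverage.days, "days")
```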

Internal Modules and Classes

  • AssetPair: This is a basic class used to pair an asset with its trading pair.
  • LRUCache: An imported class that sets up a Least Recently Used (LRU) Cache. This data structure has a specified maximum capacity and, when full, it removes the least recently used items first.
  • bigquery: Google Cloud's BigQuery Client for interacting with the BigQuery service.
  • Web3: A Python library for interacting with Ethereum.
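To illustrate the eviction behaviour described for LRUCache, here is a minimal sketch built on collections.OrderedDict. It is not the imported class itself: the name SimpleLRUCache and the get/put methods are assumptions made for this example.

```python
from collections import OrderedDict
from typing import Any, Hashable


class SimpleLRUCache:
    """Illustrative LRU cache; not the class used by the data layer."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable, default: Any = None) -> Any:
        if key not in self._items:
            return default
        # Reading a key marks it as the most recently used.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        # When over capacity, drop the least recently used entry.
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)


cache = SimpleLRUCache(capacity=2)
cache.put("ETHUSDT", 1)
cache.put("BTCETH", 2)
cache.get("ETHUSDT")      # ETHUSDT is now the most recently used entry
cache.put("BTCUSDT", 3)   # exceeds capacity, so BTCETH is evicted
```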

Internal Functions

  • get_block_timestamp: Retrieves the timestamp of a specific Ethereum block.
  • get_all_days_between: Given a start and end date, this function generates a list of all days between those two dates (inclusive).
  • datetime_to_str: Converts a list of datetime objects to a list of strings in the YYYY-MM-DD format.
  • query_table: Performs a BigQuery SQL query and returns the result.
  • get_prices: Fetches the closing price data for a list of asset pairs over a specific block range. This is done by querying a Google BigQuery table for each date in the provided block range.
  • get_prices_for_assets: Main function to retrieve asset prices. This function is responsible for preparing the cache, retrieving prices, and handling missing values.
  • get_min_max_datetimes: Retrieves the minimum and maximum datetimes for a list of assets (or all assets if empty) from a specified data feed. It constructs a list of AssetPair objects from the input assets and then queries a data table based on the data_feed parameter. Returns a DataFrame with the datetime information for each asset.
  • get_all_assets: Fetches all available assets from a specified data feed. This function queries a data table without requiring asset input, useful for obtaining a comprehensive list of assets. Returns a DataFrame enumerating each available asset for the given data source.
  • format_asset, build_cache_key, get_asset_symbol: Helper functions used in the main get_prices_for_assets function.
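The two date helpers described above can be sketched as follows. These are illustrative re-implementations under different names (all_days_between and days_to_str), written from the descriptions in this list rather than copied from the actual code.

```python
from datetime import date, timedelta
from typing import List


def all_days_between(start: date, end: date) -> List[date]:
    """Return every day from start to end, inclusive of both ends."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def days_to_str(days: List[date]) -> List[str]:
    """Format each date as YYYY-MM-DD."""
    return [day.strftime("%Y-%m-%d") for day in days]


day_strings = days_to_str(all_days_between(date(2020, 2, 27), date(2020, 3, 1)))
print(day_strings)
```

A list of day strings like this is the kind of input a per-date table query would iterate over.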

How to Use for Prices

The get_prices_for_assets function is the main entrypoint for fetching asset prices. Here is an example usage:

# Define list of assets
assets = [{'symbol': 'ETHUSDT'}, {'symbol': 'BTCETH'}]

# Define start and end block numbers; the lookback period is only used if the start block is None
start_block = 10000000
end_block = 10010000
block_lookback = 0

# Define the data source from which to query
data_feed = 'biglake'

# Define granularity (0 means the block level resolution will be used)
granularity = 1

# Fetch prices
prices = get_prices_for_assets(assets, start_block, end_block, block_lookback, data_feed, granularity)

In this example, we fetch the prices of Ethereum (ETH) in terms of USDT and Bitcoin (BTC) in terms of ETH between block numbers 10,000,000 and 10,010,000, with a granularity of one second. Because one second is currently the finest resolution available in the data layer, all prices in the range are returned.

The asset input should be a list of dictionaries, where each dictionary has a symbol key associated with the symbol of the asset. The start_block and end_block inputs define the range of blocks for which to retrieve price data. The granularity input determines the frequency of the output data.

The output is a dictionary where each key is an asset symbol and each value is a Pandas DataFrame containing the price data for that asset. The DataFrame contains Date and Close columns, where Date is the date of the price data and Close is the closing price of the asset on that date.

Note that the start and end blocks do not need to be specified. If the end block is None, the latest block is used. If the start block is None, the block lookback is subtracted from the end block to obtain it. If the block lookback is also None, the default lookback of 100000, defined in the DEFAULT_BLOCK_LOOKBACK constant, is used.
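The defaulting rules for the block range can be sketched as a small function. The name resolve_block_range is hypothetical, and the latest block is passed in as an argument here rather than fetched through Web3, so the example runs on its own. Clamping the start block at zero is an extra safeguard added for this sketch.

```python
from typing import Optional, Tuple

DEFAULT_BLOCK_LOOKBACK = 100000


def resolve_block_range(
    start_block: Optional[int],
    end_block: Optional[int],
    block_lookback: Optional[int],
    latest_block: int,
) -> Tuple[int, int]:
    """Apply the documented defaults and return (start_block, end_block)."""
    if end_block is None:
        end_block = latest_block
    if start_block is None:
        if block_lookback is None:
            block_lookback = DEFAULT_BLOCK_LOOKBACK
        # Clamping at zero is an assumption of this sketch.
        start_block = max(end_block - block_lookback, 0)
    return start_block, end_block


# Nothing specified: look back the default number of blocks from the latest block.
print(resolve_block_range(None, None, None, latest_block=17000000))
# An explicit start block takes precedence over the lookback.
print(resolve_block_range(10000000, 10010000, 0, latest_block=17000000))
```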

How to Use for Asset Listing

The list of assets for which the data layer has data can be retrieved using the get_all_assets function. Here is an example usage:

data_feed = 'biglake'
get_all_assets(data_feed)

How to Use for Datetimes

The minimum and maximum datetimes for which the data layer has data can be retrieved using the get_min_max_datetimes function. Here is an example usage:

assets = [['ETH', 'USDT']]
data_feed = 'biglake'
get_min_max_datetimes(assets, data_feed)

If you want to get the minimum and maximum datetimes for all assets, you can pass an empty list to the assets input.