Data Source
The data layer, known as the DaaS, serves as the source for all time series data for the price simulator. This includes historical data and volume values.
The code provides an interface for querying historical price data for crypto assets from a Google BigQuery database. The get_prices_for_assets function is the main entrypoint: it accepts a list of assets, a block range, and a granularity, then retrieves price data for each asset in the list. The code uses Web3 to interact with the Ethereum blockchain, Pandas for data manipulation, and Google's BigQuery client for querying.
This document covers the API endpoints available in the Data-as-a-Service (DaaS) controller. These endpoints allow you to retrieve historical price data, available assets, and date/time information from the DaaS.
DaaS Endpoints
Get Historical Prices
POST /daas and POST /daas/prices
Fetches historical price data for specified assets within a given block range or lookback period.
Input
assets (required): An array of asset identifiers for which to retrieve prices.
block_lookback (optional): The number of blocks to look back from the current block. Only used if start_block is undefined. Defaults to a pre-defined value.
start_block (optional): The starting block number for the price data retrieval. If unspecified, the lookback period is used.
end_block (optional): The ending block number for the price data retrieval. If unspecified, the latest block is used.
data_feed (optional): The data feed source to use. Defaults to biglake.
granularity (optional): The granularity of the price data in seconds. Defaults to 0, indicating block-level data.
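As a rough illustration of the parameters above, the following sketch builds a request for POST /daas/prices using only the Python standard library. The base URL, the use of a JSON request body, the asset format (borrowed from the Python usage example later in this document), and the helper name build_prices_request are all assumptions made for the example. The request is constructed but never sent.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the address of your DaaS deployment.
BASE_URL = "http://localhost:8000"


def build_prices_request(assets, start_block=None, end_block=None,
                         block_lookback=None, data_feed="biglake",
                         granularity=0):
    """Build (but do not send) a POST /daas/prices request.

    Assumes the endpoint accepts a JSON body. Optional fields left as
    None are omitted so that the server-side defaults apply.
    """
    payload = {"assets": assets, "data_feed": data_feed,
               "granularity": granularity}
    optional = {"start_block": start_block, "end_block": end_block,
                "block_lookback": block_lookback}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        url=f"{BASE_URL}/daas/prices",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_prices_request([{"symbol": "ETHUSDT"}],
                               start_block=10000000, end_block=10010000)
```

Passing the result to urllib.request.urlopen would send it. Omitting unset fields rather than sending nulls is a design choice of this sketch, not a documented requirement of the endpoint.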
Output
A JSON object where each key is a string representation of the asset identifier, and the value is another object containing arrays of data corresponding to each column in the DataFrame.
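To make the column-oriented shape concrete, here is a small sketch that parses a response of this form and turns each asset's columns into row records. The sample payload is made up for illustration; the column names Date and Close are taken from the DataFrame description later in this document, and the helper columns_to_rows is hypothetical.

```python
import json

# Made-up response body shaped like the description above; the values
# are illustrative only and do not come from the DaaS.
response_text = """
{
  "ETHUSDT": {
    "Date": ["2020-05-04T13:22:00", "2020-05-04T13:22:01"],
    "Close": [205.1, 205.3]
  }
}
"""


def columns_to_rows(columns):
    """Turn {"col": [v0, v1, ...], ...} into [{"col": v0, ...}, ...]."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]


prices = json.loads(response_text)
rows_by_asset = {asset: columns_to_rows(cols) for asset, cols in prices.items()}
```

If Pandas is available, passing each per-asset object to pd.DataFrame should give an equivalent table directly, since a dictionary of equal-length lists is one of the inputs that constructor accepts.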
Get Assets
GET /daas/assets
Retrieves a list of all available assets from the specified data feed.
Input
data_feed (optional): The data feed source from which to retrieve assets. Defaults to biglake.
Output
A JSON object with a single key, assets, containing an array of available assets in the data source.
Get Min/Max Datetimes
GET /daas/datetimes
Fetches the minimum and maximum datetimes for the specified assets from the given data feed.
Input
assets (optional): An array of asset identifiers for which to retrieve datetime information. Defaults to an empty array if not specified, indicating that all assets should be used.
data_feed (optional): The data feed source from which to retrieve datetime information. Defaults to biglake.
Output
A JSON array where each element is a JSON object representing a record from the DataFrame, with datetime fields formatted in ISO 8601 format. Clients can use pd.read_json to convert this JSON output to a usable Pandas DataFrame.
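For clients that do not use Pandas, the record-oriented output can also be handled with the standard library. The sketch below is only an illustration: the field names asset, min_datetime, and max_datetime are hypothetical, as are the sample values, and the actual field names returned by the endpoint may differ.

```python
import json
from datetime import datetime

# Illustrative payload; the field names "asset", "min_datetime" and
# "max_datetime" are hypothetical, not taken from the DaaS itself.
response_text = """
[
  {"asset": "ETHUSDT",
   "min_datetime": "2020-01-01T00:00:00.000Z",
   "max_datetime": "2023-01-01T00:00:00.000Z"}
]
"""


def parse_iso(value):
    """Parse an ISO 8601 string, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


records = json.loads(response_text)
for record in records:
    record["min_datetime"] = parse_iso(record["min_datetime"])
    record["max_datetime"] = parse_iso(record["max_datetime"])

# Length of the covered period for the first asset, as a timedelta.
span = records[0]["max_datetime"] - records[0]["min_datetime"]
```

The replacement of a trailing Z is there because datetime.fromisoformat only accepts that suffix natively from Python 3.11 onward.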
Internal Modules and Classes
AssetPair: This is a basic class used to pair an asset with its trading pair.
LRUCache: An imported class that sets up a Least Recently Used (LRU) Cache. This data structure has a specified maximum capacity and, when full, it removes the least recently used items first.
bigquery: Google Cloud's BigQuery Client for interacting with the BigQuery service.
Web3: A Python library for interacting with Ethereum.
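The eviction behavior described for LRUCache can be sketched with collections.OrderedDict. This is a stand-in written for illustration, not the imported class itself; the class name SimpleLRUCache and its get/put methods are assumptions, and the real LRUCache interface may differ.

```python
from collections import OrderedDict


class SimpleLRUCache:
    """Minimal LRU cache; a stand-in, not the imported LRUCache class."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


cache = SimpleLRUCache(capacity=2)
cache.put("ETHUSDT", 1)
cache.put("BTCETH", 2)
cache.get("ETHUSDT")      # ETHUSDT is now the most recently used
cache.put("LINKETH", 3)   # cache is full, so BTCETH is evicted
```

After these calls the cache holds ETHUSDT and LINKETH, while a lookup for BTCETH returns the default.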
Internal Functions
get_block_timestamp: Retrieves the timestamp of a specific Ethereum block.
get_all_days_between: Given a start and end date, this function generates a list of all days between those two dates (inclusive).
datetime_to_str: Converts a list of datetime objects to a list of strings in the YYYY-MM-DD format.
query_table: Performs a BigQuery SQL query and returns the result.
get_prices: Fetches the closing price data for a list of asset pairs over a specific block range. This is done by querying a Google BigQuery table for each date in the provided block range.
get_prices_for_assets: Main function to retrieve asset prices. This function is responsible for preparing the cache, retrieving prices, and handling missing values.
get_min_max_datetimes: Retrieves the minimum and maximum datetimes for a list of assets (or all assets if empty) from a specified data feed. It constructs a list of AssetPair objects from the input assets and then queries a data table based on the data_feed parameter. Returns a DataFrame with the datetime information for each asset.
get_all_assets: Fetches all available assets from a specified data feed. This function queries a data table without requiring asset input, useful for obtaining a comprehensive list of assets. Returns a DataFrame enumerating each available asset for the given data source.
format_asset, build_cache_key, get_asset_symbol: Helper functions used in the main get_prices_for_assets function.
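The two date helpers are simple enough to sketch from their descriptions alone. The versions below are re-implementations written for illustration; their signatures and their behavior on edge cases (such as an end date before the start date) are assumptions and may not match the actual code.

```python
from datetime import date, timedelta


def get_all_days_between(start, end):
    """Return every date from start to end, inclusive."""
    if end < start:
        return []  # assumed behavior for an inverted range
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def datetime_to_str(days):
    """Format each date or datetime as YYYY-MM-DD."""
    return [day.strftime("%Y-%m-%d") for day in days]


days = datetime_to_str(get_all_days_between(date(2020, 5, 4), date(2020, 5, 6)))
```

Here days is the list of the three dates 2020-05-04 through 2020-05-06 as strings, which is the kind of per-day list that get_prices is described as iterating over when querying BigQuery.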
How to Use for Prices
The get_prices_for_assets function is the main entrypoint for fetching asset prices. Here is an example usage:
# Define list of assets
assets = [{'symbol': 'ETHUSDT'}, {'symbol': 'BTCETH'}]
# Define start and end block numbers, or a lookback period if the start block is None
start_block = 10000000
end_block = 10010000
block_lookback = 0
# Define the data source from which to query
data_feed = 'biglake'
# Define granularity (0 means the block level resolution will be used)
granularity = 1
# Fetch prices
prices = get_prices_for_assets(assets, start_block, end_block, block_lookback, data_feed, granularity)
In this example, we are fetching the prices of Ethereum (ETH) in terms of USDT and Bitcoin (BTC) in terms of ETH between block numbers 10,000,000 and 10,010,000, with a granularity of one second. Because one second is currently the finest resolution the data layer offers, all available prices are returned.
The asset input should be a list of dictionaries, where each dictionary has a symbol key associated with the symbol of the asset. The start_block and end_block inputs define the range of blocks for which to retrieve price data. The granularity input determines the frequency of the output data.
The output is a dictionary where each key is an asset symbol and each value is a Pandas DataFrame containing the price data for that asset. The DataFrame contains Date and Close columns, where Date is the date of the price data and Close is the closing price of the asset on that date.
Note that the start and end blocks do not need to be specified. If the end block is None, then the latest block will be used. If the start block is None, then the block lookback will be subtracted from the end block (and if the block lookback is None, then the default lookback value of 100000, as defined in the DEFAULT_BLOCK_LOOKBACK constant, will be used).
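The defaulting rules in the note above can be summarized in a few lines. This is a sketch of the described logic, not the actual implementation: the function name resolve_block_range and the latest_block parameter are hypothetical (the real code would presumably obtain the latest block via Web3), and clamping the start block at zero is an added assumption.

```python
DEFAULT_BLOCK_LOOKBACK = 100000  # default value stated in the text above


def resolve_block_range(start_block, end_block, block_lookback, latest_block):
    """Return (start_block, end_block) after applying the defaults.

    latest_block stands in for the current chain head.
    """
    if end_block is None:
        end_block = latest_block
    if start_block is None:
        if block_lookback is None:
            block_lookback = DEFAULT_BLOCK_LOOKBACK
        # Clamping at 0 is an assumption of this sketch.
        start_block = max(end_block - block_lookback, 0)
    return start_block, end_block


# Nothing specified: look back the default 100000 blocks from the head.
default_range = resolve_block_range(None, None, None, latest_block=17000000)
# End block and lookback specified: start is derived from them.
derived_range = resolve_block_range(None, 10010000, 10000, latest_block=17000000)
```

When start_block is given explicitly, block_lookback is ignored, which matches the endpoint description stating that the lookback is only used if start_block is undefined.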
How to Use for Asset Listing
The list of assets for which the data layer has data can be retrieved using the get_all_assets function.
Here is an example usage:
data_feed = 'biglake'
get_all_assets(data_feed)
How to Use for Datetimes
The minimum and maximum datetimes for which the data layer has data can be retrieved using the get_min_max_datetimes function. Here is an example usage:
assets = [['ETH', 'USDT']]
data_feed = 'biglake'
get_min_max_datetimes(assets, data_feed)
If you want to get the minimum and maximum datetimes for all assets, you can pass an empty list to the assets input.