
Metadata

Name
Transaction Graph Dataset for the Ethereum Blockchain
Repository
ZENODO
Identifier
doi:10.5281/zenodo.4718440
Description
This dataset contains ether as well as popular ERC20 token transfer transactions extracted from the Ethereum Mainnet blockchain.

Only send-ether, contract function call, and contract deployment transactions are present in the dataset. Miner rewards (static block rewards) and "uncle block inclusion rewards" are added to the dataset as transactions. Transaction fee rewards and "uncles rewards" are not currently included in the dataset.

Details of the dataset are given below:

FILENAME FORMAT:

The filenames have the following format:

eth-tx-<start blockno>-<end blockno>.txt.bz2

where <start blockno> is the starting block number and <end blockno> is the final block number.

For example, the file eth-tx-1000000-1099999.txt.bz2 contains transactions from block 1000000 to block 1099999 inclusive.

The files are compressed with bzip2. They can be uncompressed using the bunzip2 command.
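
The naming scheme and compression described above can also be handled directly from a script. The following is a minimal Python sketch, not part of the dataset itself: the helper names block_range and iter_lines are hypothetical, and UTF-8 text encoding is an assumption, since the description does not state the encoding.

```python
import bz2
import re

# Documented name format: eth-tx-<start blockno>-<end blockno>.txt.bz2
FILENAME_RE = re.compile(r"^eth-tx-(\d+)-(\d+)\.txt\.bz2$")


def block_range(filename):
    """Return (start_block, end_block), both inclusive, from a data file name."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return int(match.group(1)), int(match.group(2))


def iter_lines(path):
    """Yield non-empty lines from a bzip2 file without unpacking it on disk.

    Assumption: the files are UTF-8 (or plain ASCII) text.
    """
    with bz2.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:
                yield line
```

Streaming with bz2.open avoids writing the uncompressed file to disk, which may matter for the larger block ranges.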


TRANSACTION FORMAT:

Each line in a file corresponds to a transaction. Each transaction has the following format:

<SYMBOL> <blockno> <tx no> <from addr> <to addr> <value>

<SYMBOL>       Abbreviation for the asset. For example, ETH means an ether transfer in Wei units. ERC20 token transfers (transfer and transferFrom function calls in the ERC20 contract) are indicated by the token symbol. For example, GUSD is the Gemini USD stablecoin. The JSON file erc20tokens.json given below contains the details of the ERC20 tokens. Failed transactions are prefixed with "F-".

<blockno>      Number of the block that contains the transaction.

<tx no>        Position of the transaction in the block (i.e. the transaction number in the block). Static block rewards and uncle block inclusion rewards have 0 as tx no. The positions of the transactions in the block are therefore shifted by 1, so the first transaction has 1 as tx no.

<from addr>    Source Ethereum address of the transfer. For static block rewards, the from addr is ETHMAINBLOCK. For uncle block inclusion rewards, the from addr is ETHMAINUNCLE.

<to addr>      Destination Ethereum address of the transfer. For contract deployment transactions, the to addr is 0x0 instead of null.

<value>        Amount of the transfer. The amount is given as a hexadecimal number.
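
As an illustration of the field layout above, the following Python sketch splits one line into its six fields. It is a sketch under stated assumptions rather than a reference parser: fields are assumed to be separated by whitespace, blockno and tx no are assumed to be decimal, and the names Transfer and parse_transfer, as well as the sample line in the usage note, are invented for this example.

```python
from typing import NamedTuple

# Special source addresses documented for reward records.
REWARD_SOURCES = ("ETHMAINBLOCK", "ETHMAINUNCLE")


class Transfer(NamedTuple):
    symbol: str     # asset symbol with any "F-" prefix removed
    failed: bool    # True when the line carried the "F-" prefix
    block_no: int
    tx_no: int      # 0 for reward records, 1-based for regular transactions
    from_addr: str
    to_addr: str
    value: int      # amount decoded from hexadecimal

    @property
    def is_reward(self):
        return self.from_addr in REWARD_SOURCES

    @property
    def is_deployment(self):
        return self.to_addr == "0x0"


def parse_transfer(line):
    """Parse '<SYMBOL> <blockno> <tx no> <from addr> <to addr> <value>'."""
    fields = line.split()  # assumption: whitespace-separated fields
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    symbol, block_no, tx_no, from_addr, to_addr, value = fields
    failed = symbol.startswith("F-")
    if failed:
        symbol = symbol[2:]
    # int(..., 16) accepts hexadecimal text with or without a "0x" prefix.
    return Transfer(symbol, failed, int(block_no), int(tx_no),
                    from_addr, to_addr, int(value, 16))
```

For instance, the made-up line "F-ETH 1000000 1 0xaaaa 0x0 0xde0b6b3a7640000" would parse as a failed ether contract deployment with a value of 10**18 Wei.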


BLOCK TIME FORMAT:

The block time file has the following format:

<block no> <timestamp>

<block no>     Number of the block.

<timestamp>    Unix timestamp at which the block was mined, given as a hexadecimal number.
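
A block time line can be converted to a calendar date as sketched below in Python. The function name parse_block_time is hypothetical, the block number is assumed to be decimal, and the timestamp is assumed to count seconds since the Unix epoch in UTC.

```python
from datetime import datetime, timezone


def parse_block_time(line):
    """Parse '<block no> <timestamp>' into (block_no, UTC datetime)."""
    block_no, timestamp = line.split()  # assumption: whitespace-separated
    seconds = int(timestamp, 16)        # hexadecimal, "0x" prefix optional
    return int(block_no), datetime.fromtimestamp(seconds, tz=timezone.utc)
```

Returning a timezone-aware datetime keeps the result independent of the local time zone of the machine that runs the script.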


erc20tokens.json FILE:

This file contains the list of popular ERC20 token contracts whose transfer/transferFrom transactions appear in the data files.

ERC20 token list: USDT TRYb XAUt BNB LEO LINK HT HEDG MKR CRO VEN INO PAX INB SNX REP MOF ZRX SXP OKB XIN OMG SAI HOT DAI EURS HPT BUSD USDC SUSD HDG QCAD PLUS BTCB WBTC cWBTC renBTC sBTC imBTC pBTC
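
The symbol list above can be used to sort records by asset type. The Python sketch below hardcodes the listed symbols rather than reading erc20tokens.json, whose internal layout is not described here. The function name asset_kind and the three category labels are invented for illustration, and exact, case-sensitive symbol matching is an assumption.

```python
# Symbols copied from the ERC20 token list above.
ERC20_SYMBOLS = frozenset(
    "USDT TRYb XAUt BNB LEO LINK HT HEDG MKR CRO VEN INO PAX INB SNX REP "
    "MOF ZRX SXP OKB XIN OMG SAI HOT DAI EURS HPT BUSD USDC SUSD HDG QCAD "
    "PLUS BTCB WBTC cWBTC renBTC sBTC imBTC pBTC".split()
)


def asset_kind(symbol_field):
    """Classify the <SYMBOL> field as 'ether', 'erc20', or 'unknown'.

    A leading "F-" (failed transaction) is ignored for classification.
    """
    symbol = symbol_field[2:] if symbol_field.startswith("F-") else symbol_field
    if symbol == "ETH":
        return "ether"
    if symbol in ERC20_SYMBOLS:
        return "erc20"
    return "unknown"
```

Symbols that are not in the list fall into the "unknown" category, so records with unlisted symbols can be inspected separately instead of being silently dropped.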


IMPORTANT NOTE:

Public Ethereum Mainnet blockchain data is open and can be obtained by connecting as a node on the blockchain or by using block explorer web sites such as http://etherscan.io. Downloaders and users of this dataset accept full responsibility for using the data in a manner compliant with GDPR and any other applicable regulations. We provide the data as is and cannot be held responsible for anything.
Data or Study Types
multiple
Source Organization
Unknown
Access Conditions
available
Year
2021
Access Hyperlink
https://doi.org/10.5281/zenodo.4718440

Distributions

  • Encoding Format: HTML ; URL: https://doi.org/10.5281/zenodo.4718440
This project was funded in part by grant U24AI117966 from the NIH National Institute of Allergy and Infectious Diseases as part of the Big Data to Knowledge program. We thank all members of the bioCADDIE community for their valuable input on the overall project.