Visualize Ethereum ERC20 Token Data Using Amazon Managed Blockchain Query and Amazon QuickSight

Understanding the behavior and distribution of ERC20 tokens on the Ethereum blockchain is essential for businesses issuing digital assets, such as stablecoins. Companies like Paxos, which issue USD-backed tokens such as PYUSD, need actionable insights into metrics like top token holders, daily active users, transaction volume, and usage across decentralized finance (DeFi) platforms. Fortunately, combining Amazon Managed Blockchain (AMB) Query with Amazon QuickSight enables organizations to extract, process, and visualize Ethereum token data efficiently—without managing complex blockchain infrastructure.

This guide walks through how to use AWS services to build a scalable pipeline for analyzing and visualizing ERC20 token data, so teams can make data-driven decisions in near real time.

Why Use AMB Query for ERC20 Token Analytics?

Amazon Managed Blockchain Query simplifies access to public blockchain data by offering serverless APIs that return standardized, finalized blockchain records. This eliminates the need to run full nodes or maintain indexing systems—common pain points when working with Ethereum data.

Real-Time, Finalized Data Access

To track dynamic metrics such as daily active users, transaction volume, and latest transfers, you need access to up-to-date and finalized blockchain events. AMB Query’s ListTransactions and ListTransactionEvents APIs deliver exactly that.

For example, by calling ListTransactions with the contract address of a token like PYUSD (0x6c3ea9036406852006290770BEdFcAbA0e23A0e8), you retrieve all transactions involving that token. Then, using ListTransactionEvents, you can drill down into individual transfers—identifying sender, receiver, amount, and timestamp—for both ERC20 tokens and native ETH.
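The two-step lookup described above can be sketched in Python with boto3. This is a minimal illustration, not the script from the repository: the helper names (paginate, token_transfers, make_client) are hypothetical, and the response field names and the us-east-1 Region are assumptions based on the AMB Query API as exposed by boto3, so check them against the current API reference before relying on them.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

PYUSD_CONTRACT = "0x6c3ea9036406852006290770BEdFcAbA0e23A0e8"
NETWORK = "ETHEREUM_MAINNET"


def paginate(call: Callable[..., Dict[str, Any]], result_key: str,
             **kwargs: Any) -> Iterator[Dict[str, Any]]:
    """Yield every item under result_key, following nextToken until it runs out."""
    token: Optional[str] = None
    while True:
        extra = {"nextToken": token} if token else {}
        page = call(**kwargs, **extra)
        yield from page.get(result_key, [])
        token = page.get("nextToken")
        if not token:
            return


def token_transfers(client: Any, contract: str = PYUSD_CONTRACT,
                    since: Any = None) -> List[Dict[str, Any]]:
    """Return one row per transfer event for transactions involving `contract`."""
    tx_args: Dict[str, Any] = {"address": contract, "network": NETWORK}
    if since is not None:
        # Optional lower bound (a datetime) so a run does not scan the full history.
        tx_args["fromBlockchainInstant"] = {"time": since}
    rows = []
    for tx in paginate(client.list_transactions, "transactions", **tx_args):
        for ev in paginate(client.list_transaction_events, "events",
                           transactionHash=tx["transactionHash"], network=NETWORK):
            rows.append({
                "contractaddress": ev.get("contractAddress"),  # absent for native ETH
                "eventtype": ev.get("eventType"),
                "from": ev.get("from"),
                "to": ev.get("to"),
                "value": ev.get("value"),
                "transactionhash": tx["transactionHash"],
                "transactiontimestamp": tx.get("transactionTimestamp"),
            })
    return rows


def make_client(region: str = "us-east-1") -> Any:
    """Build a live client; needs boto3 and credentials allowed to call AMB Query."""
    import boto3  # imported here so the helpers above work without boto3 installed
    return boto3.client("managedblockchain-query", region_name=region)
```

Calling token_transfers(make_client(), since=some_datetime) would return rows whose keys match the column names of the events table used later in this guide. Passing since is advisable for an active token, because an unbounded call walks the contract's entire history.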

Historical Insights into Token Holders

Understanding long-term trends requires historical context. The ListTokenBalances API allows you to capture snapshots of all token holders at a given point in time. This is crucial for monitoring changes in holder concentration, identifying top holders, and calculating the total number of unique addresses holding the token.

Unlike manual ETL processes that require custom indexing and storage solutions, AMB Query delivers historical balance data via REST APIs—reducing development time and infrastructure overhead.
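As a rough sketch of a holder snapshot, the Python below pages through ListTokenBalances and ranks the results. The function names are hypothetical, and the response shape (tokenBalances, ownerIdentifier.address, balance returned as a string) is an assumption about the boto3 interface that should be verified against the API reference.

```python
from decimal import Decimal
from typing import Any, Dict, List, Tuple

PYUSD_CONTRACT = "0x6c3ea9036406852006290770BEdFcAbA0e23A0e8"


def holder_snapshot(client: Any, contract: str = PYUSD_CONTRACT,
                    network: str = "ETHEREUM_MAINNET") -> Dict[str, Decimal]:
    """Return {holder address: balance} for every page of ListTokenBalances."""
    balances: Dict[str, Decimal] = {}
    token = None
    while True:
        kwargs: Dict[str, Any] = {
            "tokenFilter": {"network": network, "contractAddress": contract}
        }
        if token:
            kwargs["nextToken"] = token
        page = client.list_token_balances(**kwargs)
        for item in page.get("tokenBalances", []):
            # Decimal avoids overflow and float rounding on large token amounts.
            balances[item["ownerIdentifier"]["address"]] = Decimal(item["balance"])
        token = page.get("nextToken")
        if not token:
            return balances


def top_holders(balances: Dict[str, Decimal], n: int = 10) -> List[Tuple[str, Decimal]]:
    """Return the n largest holders, biggest first."""
    return sorted(balances.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

With a client created by boto3.client("managedblockchain-query"), len(holder_snapshot(client)) gives the number of unique holding addresses and top_holders(...) gives the largest ones.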

Cost-Effective and Scalable Architecture

Processing blockchain data at scale can be expensive due to high computational and storage demands. AMB Query operates on a pay-as-you-go model, allowing you to query only the data you need when you need it. With predictable pricing based on API call complexity, businesses gain full cost transparency while avoiding the burden of maintaining blockchain nodes.

Building the Visualization Pipeline

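To turn raw blockchain data into insightful dashboards, we integrate four AWS services into one workflow: Amazon S3 to store the extracted data, AWS Glue to run the extraction jobs, Amazon Athena to query the results, and Amazon QuickSight to visualize them.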

Let’s explore how each component fits into the solution.

Step 1: Set Up an S3 Bucket

All extracted data will be stored in Amazon S3. Create a bucket via the AWS Console following Amazon S3’s official guide. This bucket will serve as the central data lake for your token analytics pipeline.

Step 2: Extract Data Using AWS Glue Jobs

AWS Glue runs serverless Python scripts to call AMB Query APIs and store results in S3. Two primary jobs are used:

Capture Ongoing Token Transfers

Create a Glue job named token-transfers using the token-transfers.py script from the official GitHub repository, and set its job parameters, including the contract address of the token you want to track.

Schedule this job to run hourly so your analytics reflect near-real-time activity—ideal for tracking daily volume and active addresses.
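One way to set up the hourly schedule programmatically is a scheduled Glue trigger. The sketch below only assumes that the job is named token-transfers as above; the trigger name and helper functions are hypothetical, and the same result can be achieved from the Glue console.

```python
from typing import Any, Dict

# Glue schedules use six-field cron expressions: minute hour day month weekday year.
HOURLY = "0 * * * ? *"   # at minute 0 of every hour (UTC)


def scheduled_trigger(job_name: str, cron: str) -> Dict[str, Any]:
    """Build the arguments for glue.create_trigger for a time-based schedule."""
    return {
        "Name": f"{job_name}-schedule",      # hypothetical naming convention
        "Type": "SCHEDULED",
        "Schedule": f"cron({cron})",
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }


def create_trigger(params: Dict[str, Any]) -> None:
    """Create the trigger; needs boto3 and permission to call glue:CreateTrigger."""
    import boto3  # imported lazily so the builder above has no dependencies
    boto3.client("glue").create_trigger(**params)
```

Calling create_trigger(scheduled_trigger("token-transfers", HOURLY)) would register the schedule. Keeping the argument builder separate from the API call makes the schedule easy to inspect before anything is created.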

Snapshot Token Holder Distribution

Use the token-snapshot.py script to run a daily job (token-snapshot) that captures current token balances. Schedule it once per day to monitor shifts in holder distribution and detect potential centralization risks.
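A simple way to quantify centralization from a snapshot is the share of supply held by the largest N addresses. The sketch below assumes the snapshot file is a CSV with an address,balance header row, matching the table defined later in this guide; the function name and the choice of metric are illustrative, not part of the repository scripts.

```python
import csv
import io
from decimal import Decimal


def top_holder_share(csv_text: str, n: int = 10) -> Decimal:
    """Return the fraction of the total balance held by the n largest holders."""
    reader = csv.DictReader(io.StringIO(csv_text))
    balances = sorted((Decimal(row["balance"]) for row in reader), reverse=True)
    total = sum(balances, Decimal(0))
    if total == 0:
        return Decimal(0)  # empty snapshot: avoid dividing by zero
    return sum(balances[:n], Decimal(0)) / total


sample = "address,balance\n0xa,70\n0xb,20\n0xc,10\n"
print(top_holder_share(sample, n=1))  # share held by the single largest holder
```

Comparing this figure across daily snapshots shows whether holdings are concentrating or spreading out over time.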

Step 3: Query Data with Amazon Athena

Once data lands in S3, use Athena to create external tables for analysis.

Run this SQL to create the events table (for transfers):

CREATE EXTERNAL TABLE events(
 contractaddress string,
 eventtype string,
 `from` string,
 `to` string,
 value string,
 transactionhash string,
 transactiontimestamp string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ('escapeChar'='\\', 'quoteChar'='\"', 'separatorChar'=',')
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://your-bucket/pysud/events'
TBLPROPERTIES ('classification'='csv', 'skip.header.line.count'='1')

And this for the token_snapshot table:

CREATE EXTERNAL TABLE token_snapshot(
 address string,
 balance string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ('escapeChar'='\\', 'quoteChar'='\"', 'separatorChar'=',')
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://your-bucket/pysud/snapshot'
TBLPROPERTIES ('classification'='csv', 'skip.header.line.count'='1')

These tables enable powerful SQL-based analysis of token movement and ownership patterns.
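As an illustration of that analysis, the Python sketch below builds two Athena queries against these tables and submits them with boto3. The query text, function names, database name, and output location are assumptions for the example. Note that in SELECT statements Athena expects reserved words such as from to be wrapped in double quotes.

```python
def top_holders_sql(limit: int = 10) -> str:
    """Largest balances in the snapshot; balance is stored as a string, so cast it."""
    return (
        "SELECT address, CAST(balance AS DECIMAL(38, 0)) AS balance_units\n"
        "FROM token_snapshot\n"
        "ORDER BY balance_units DESC\n"
        f"LIMIT {int(limit)}"
    )


def top_senders_sql(limit: int = 10) -> str:
    """Addresses that appear most often as the sender of a transfer."""
    return (
        'SELECT "from" AS sender, COUNT(*) AS transfers\n'
        "FROM events\n"
        'GROUP BY "from"\n'
        "ORDER BY transfers DESC\n"
        f"LIMIT {int(limit)}"
    )


def run_query(sql: str, database: str, output_location: str) -> str:
    """Submit a query to Athena and return its execution ID (needs boto3)."""
    import boto3  # imported lazily so the SQL builders have no dependencies
    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

For example, run_query(top_holders_sql(20), "default", "s3://your-bucket/athena-results/") would start the query, with the database and results path replaced by your own values. The same SQL can also be pasted directly into the Athena query editor.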

Step 4: Visualize in Amazon QuickSight

Connect QuickSight to Athena and import the events and token_snapshot datasets.

Create a Holder Distribution Pie Chart

  1. Select the token_snapshot dataset.
  2. Create a calculated field named balance_int that converts the balance string to a number:
    parseInt(balance)
  3. Choose a pie chart visualization.
  4. Add address as a dimension and the new balance_int as value.
  5. Clean up labels by truncating addresses:
    concat(substring(address, 1, 6), '...', substring(address, 39, 4))

This produces a clean, professional-looking chart showing top holders.
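To see what the label expression produces, here is the same truncation in Python. It is only an illustration of the logic: QuickSight's substring is 1-indexed, so substring(address, 39, 4) selects characters 39 through 42, which are the last four characters of a 42-character Ethereum address.

```python
def short_address(address: str) -> str:
    """Mirror concat(substring(address, 1, 6), '...', substring(address, 39, 4))."""
    # 1-indexed (start=1, length=6)  -> Python slice [0:6]
    # 1-indexed (start=39, length=4) -> Python slice [38:42]
    return address[0:6] + "..." + address[38:42]


print(short_address("0x6c3ea9036406852006290770BEdFcAbA0e23A0e8"))  # 0x6c3e...A0e8
```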

You can expand this dashboard with line charts for daily transfers, bar graphs for top DeFi protocols used, and tables listing recent transactions—fully automated and updated hourly or daily.

Frequently Asked Questions

Q: Can I use this solution for tokens other than PYUSD?
A: Yes. Simply replace the contract address in the Glue job parameters with any ERC20 token’s smart contract address on Ethereum.

Q: Is this method suitable for high-frequency trading analytics?
A: While the hourly update frequency works well for business reporting, ultra-low-latency use cases may require more frequent polling or alternative architectures.

Q: How secure is my data in this pipeline?
A: All AWS services used support encryption at rest and in transit, and IAM roles and policies let you enforce least-privilege access between components.

Q: Can I automate dashboard sharing with stakeholders?
A: Yes. QuickSight supports scheduled email reports and embedded dashboards for internal or public sharing.

Q: Does AMB Query support blockchains other than Ethereum?
A: Yes. In addition to Ethereum Mainnet and the Sepolia testnet, AMB Query supports Bitcoin Mainnet and Testnet. Check the AMB Query documentation for the current list of networks.

Q: What happens if a Glue job fails?
A: AWS Glue integrates with Amazon CloudWatch for monitoring and alerting. You can set up retries or notifications for failed runs.

Final Thoughts

By leveraging Amazon Managed Blockchain Query, AWS Glue, Athena, and QuickSight, businesses can build robust, automated pipelines to analyze ERC20 token data without deep blockchain expertise. Whether you're monitoring stablecoin circulation, assessing community engagement, or auditing DeFi integrations, this architecture delivers timely, accurate insights.

With cloud-native scalability and minimal operational overhead, it's never been easier to bring transparency to blockchain-based assets.
