The Graph Indexer Experience
Summary
Based on the research collected here, I created a framework for an Indexer Operational Dashboard. The research included reading The Graph's Indexer channel, reviewing the Discords run by Indexers, watching Indexer Office Hours, and talking with individual Indexers.
The infrastructure and data flows are complex:
Zooming out to the business flywheel:
However, to more fully understand the opportunities and challenges, I widen the lens, which opens up questions about a broader set of potential considerations for Indexers.
Market Dynamics
The value of The Graph's work token, $GRT, rises as query volume increases.
As more valuable data from more Networks becomes available, more dApps, and more users of those dApps, drive the demand for queries. This demand applies upward pressure on the value of $GRT.
Indexers contribute to the ecosystem by serving data against queries run on their infrastructure. So while they do not directly drive demand, Indexers shape the query experience through the accuracy and performance of their services.
While this document looks at the problems and goals of the Indexers, we need to do so in light of the overall value to the ecosystem and the token.
Personas
Core Persona - Professional Node Operator
Goals are to:
- scale servers to
- deliver query quality with the lowest latency and highest stability, while
- reducing costs, in order to
- generate the highest returns, so they can
- attract and retain delegators and increase delegated stake on their subgraphs, to
- further increase their profits
Many are also values-aligned with the principles of decentralization.
Indexing also appears to be a profession in its own right, and Indexers often support a portfolio of Networks.
Possible Adjacent Personas
Are there other personas we need to explore?
Possible reasons:
- Other personas could drive demand for queries
- Developers of Subgraphs may want a way to also become Indexers, but not full-time
- Developers of dApps may want to also become Indexers if doing so benefits their dApps
I haven't yet encountered anyone who fits these alternative personas.
However, keeping in mind the broader ecosystem, with a focus on growing the ecosystem and its value, these should be explored.
As different tech-driven markets have evolved, different personas have emerged:
- An "easy-button" persona that wants to enter a market with a much lower technical barrier, accepting a trade-off in performance. Examples: Cloudflare prosumers, consumer plug-and-play miners, casual gaming
- A "super seller" persona, often a power-law player that is good at marketing. Examples: eBay super sellers, Uber fleet managers, e-commerce aggregators
These are just possible explorations of other adjacent personas who could also be Indexers.
For the rest of this document, I focus on the Core Persona.
Primary Jobs to Be Done (JTBD)
Indexers' range of tasks and responsibilities fall into two large buckets: Infrastructure Operations (DevOps) and Financial Operations (RevOps).
Infrastructure Operations (DevOps)
Infrastructure operations refers to the "care and feeding" -- the DevOps -- of the nodes. Professional node operators are experts in this area.
Unlike traditional DevOps teams, which support continuous deployment of code bases and fixes, Indexers currently manage infrastructure for their Subgraphs' query endpoints. While those endpoints are hit by different applications, Indexers' service does not involve deploying new applications.
To serve the endpoint for queries and the indexing of networks, they care about:
- scaling servers
- delivering quality
- reducing latency
- increasing stability
problems they face
- infrastructure fails or has performance issues
- set ups are complex and time-consuming
- monitoring and troubleshooting is complex
drill downs
how do I determine the architecture?
- I want both operating guidance and planning insights on whether to run one really heavy node, or spread indexing infrastructure out across the world.
- I want to make the right decisions on the hardware and cloud vendors
how do I configure resources?
- I want to know how much throughput or other resources to allocate per service (e.g., GraphQL queries, indexing)
how do I deploy easily?
- I want to quickly deploy from local, testnet, to mainnet
- I want to scale up and deploy to the right infrastructure
how do I monitor and troubleshoot my infrastructure?
A core part of the devops is being alerted and having the right telemetry. Current tools present some limitations:
- I want to limit or filter actions from the CLI to reduce noise and unnecessary pagination
- I want to receive relevant alerts to fix priority issues
- I want to easily use tools like Prometheus and Grafana with minimal set up
- I want a usable and flexible visual dashboard to run my indexing operations (allocations, rules, network)
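As one concrete example of "relevant alerts with minimal setup," a supported tooling bundle could ship pre-built Prometheus alerting rules. This is a sketch only: the histogram metric name `indexer_query_latency_seconds_bucket` is a placeholder, not a metric the indexer components are known to expose.

```yaml
# Hypothetical pre-packaged alert rule for an Indexer monitoring bundle.
# Metric names are placeholders; substitute whatever the indexer
# components actually expose to Prometheus.
groups:
  - name: indexer-queries
    rules:
      - alert: HighQueryLatency
        # p95 query latency over 5 minutes, computed from a histogram metric
        expr: histogram_quantile(0.95, sum(rate(indexer_query_latency_seconds_bucket[5m])) by (le)) > 0.5
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "p95 query latency above 500ms for 10 minutes"
```

Shipping a small set of such rules, plus matching Grafana dashboards, is the kind of "minimal set up" the quotes above point at.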
how do I monitor the subgraphs I am indexing?
- I want to know when a subgraph has been deprecated so I can reallocate
how do I scale my infrastructure?
- As data increases, I want to scale with ingestion volume.
- I want to shard my data without adversely affecting performance.
Financial Operations (RevOps)
Operators run their Indexers as a business.
As such, they share the same core consideration as any business: maximize and grow profit.
In order to do this, they want to:
- reduce costs
- optimize allocations
- maximize rewards
- attract and retain delegators
problems they face
- optimizing allocations is confusing or manual or opaque
- it takes time to market to delegators
- operating infrastructure is costly and sometimes not covered by rewards
drill downs
how do I manage costs?
- When I establish or modify my infrastructure, how do I ensure I spend the least to get the maximum output?
- Time is part of my cost: how do I streamline Infrastructure Operations to reduce time spent reacting?
how can I better define the cost models?
- When I set up cost models, how do I decide which model makes sense for me?
- I want to know the performance of a given cost model
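For context, cost models for The Graph's indexer-service are written in the Agora language, where rules map query shapes to prices and a `default` rule catches everything else. A minimal illustration; the prices and the query shape are placeholders, not recommendations:

```
query { pairs(skip: $skip) { id } } when $skip > 2000 => 0.0001 * $skip;
default => 0.00025;
```

Tooling that reports the revenue and hit rate of each rule would directly answer the "performance of a given cost model" question above.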
how can I continue to monitor and improve cost models?
- Once I have established models, how do I know whether I need to make adjustments based on demand and resource utilization?
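One lightweight way to frame "do I need to adjust?" is a break-even check: does projected query-fee revenue at the current price cover infrastructure cost with some margin? A sketch with made-up numbers; the `margin` parameter and all figures are illustrative assumptions, not a real pricing model:

```python
def breakeven_price_per_query(monthly_infra_cost_grt: float,
                              monthly_query_volume: float) -> float:
    """Minimum GRT price per query at which query fees cover infra cost."""
    if monthly_query_volume <= 0:
        raise ValueError("query volume must be positive")
    return monthly_infra_cost_grt / monthly_query_volume


def should_review_price(current_price: float,
                        monthly_infra_cost_grt: float,
                        monthly_query_volume: float,
                        margin: float = 1.2) -> bool:
    """Flag a cost-model review when revenue falls below cost * margin."""
    revenue = current_price * monthly_query_volume
    return revenue < monthly_infra_cost_grt * margin


# Illustrative only: 50,000 GRT/month infra cost, 400M queries/month.
print(breakeven_price_per_query(50_000, 400_000_000))
print(should_review_price(0.0001, 50_000, 400_000_000))
```

A dashboard could run this kind of check continuously against observed demand and resource utilization, surfacing an alert rather than leaving it to manual review.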
how can I attract and retain delegators?
- How can I provide compelling customer service?
- How can I differentiate myself to appeal to delegators?
how can I optimize my indexing decisions?
- I want to have control over these decisions
- I recognize the complexity and that there are AI tools for this, but don't know how much to trust a black box
- This needs to be an area of differentiation
- I worry that the indexer rewards will dry up
Product Metrics
If we are overseeing the success of the enabling products, what do we need to impact, how do we measure it (in a decentralized environment), and what set of activities and strategies gets us there?
Tools Adoption
- Adoption of tools
- NPS-like score of the tools or experience
- Frequency of customization (vs. "it just works")
Success Metrics
Some broad metrics to help direct attention on the success of the Indexers. Note: I think there's a challenge in a decentralized environment to get infrastructure metrics (it might be done already but I couldn't find it). Is this something that is worth acquiring globally?
- Performance metrics across infrastructure
  - Query performance
  - Index performance
- Financial return metrics
  - Rewards earned by type (e.g., indexing rewards, query rewards, MIPS)
  - Rewards by Subgraph
  - Average rewards by Indexer
  - Average returns for delegated stakes
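The delegator-return metric above can be derived from on-chain parameters: in The Graph, an indexer's indexingRewardCut determines the share of indexing rewards the indexer keeps, with the remainder distributed to delegators pro rata. A minimal sketch with illustrative numbers:

```python
def delegator_share(total_rewards: float, reward_cut: float) -> float:
    """Rewards flowing to delegators after the indexer's cut.

    reward_cut is the indexer's reward cut as a fraction in [0, 1].
    """
    if not 0.0 <= reward_cut <= 1.0:
        raise ValueError("reward_cut must be in [0, 1]")
    return total_rewards * (1.0 - reward_cut)


def avg_delegator_return(total_rewards: float, reward_cut: float,
                         delegated_stake: float) -> float:
    """Average return per unit of delegated stake for the period."""
    return delegator_share(total_rewards, reward_cut) / delegated_stake


# Illustrative: 10,000 GRT rewards, 10% cut, 1,000,000 GRT delegated.
print(avg_delegator_return(10_000, 0.10, 1_000_000))  # roughly 0.9% for the period
```

Aggregated across indexers and periods, this gives the "average returns for delegated stakes" series without any off-chain instrumentation.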
Engagement
- Involvement in office hours
- Outbound activities?
Growth
- Number of indexers
- Correlation of indexer growth with rewards and performance
Research
Tooling
Indexers seem to have their own tooling based on what they are most familiar with. Here is a sampling, which includes tools they have created for themselves.
- ELK stack
- Grafana
- Zabbix roadmap
- Graphscan
- Graph Indexers
- Streamlit (for building and sharing data apps)
- chainstack/graph-deployment (GitHub): tooling to run The Graph in the cloud
- StakeSquid/graphprotocol-mainnet-docker (GitHub): Graph Protocol Mainnet Docker guide by StakeSquid
Questions
Have preferences emerged?
It seems like there are existing containers for running subgraphs and the like. What preferences have formed, and why?
Should there be a first-class supported tooling solution for monitoring?
It appears that there are different flavors of monitoring.
Given the flexible options available, both open source and SaaS, should there be a recommended tool with pre-built templates and agents that simplifies this part of the Indexer Experience?
The scope of monitoring mixes standard infrastructure telemetry (CPU, storage, bandwidth), standard application metrics around indexing (query volume, latency), and Indexer-specific metrics (POI discrepancies, the actions queue, subgraph sync rate).
The tools above already seem to cover some aspects of these.
Would it help to provide out of the box a robust tool that covers the full range?
This could allow Indexers a way to think in the two broad buckets described above -- DevOps and RevOps.
My Initial Hypothesis
While it seems that Indexers are fine with, and perhaps even enjoy, building their own tools, I believe this is because out-of-the-box tools aren't meeting their needs.
This is common, and a reasonable response is to let a thousand flowers bloom.
However, as the number of Indexers grows, more standard, supported tools will become appealing: they free up time spent building monitoring tools and let Indexers focus on DevOps (the actual management of infrastructure) and, for those who want to differentiate themselves, RevOps (what can drive higher returns).
I believe Observability is a high-leverage area that can drive common standards.
How an Indexer actually manages and deploys may be bespoke, shaped by their background (and although I believe better abstractions can improve this, that is hard).
However, we should be able to converge on standard industry metrics and telemetry, while also deeply empathizing with Indexers on The Graph-specific metrics. This, combined with better root-cause tooling, could free up time while reducing MTTR and improving overall performance.
Based on this thesis, I took a pass at the Indexer Operational Dashboard.