Request for Discussion: Data Economy Index (DATA)

Glad to hear you believe in the vision and potential for the Data Economy Index!

@Kiba and I felt that it was most important for a given token’s weight in the index to be based on a combination of market value (the market cap weight) and on-chain economic activity relative to market value (the economic weight).
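To make the idea concrete, here is a minimal sketch of how a market cap weight and an economic weight could be blended into a single index weight. This is not the actual Matrix Scoring System: the 50/50 blend (`mcw_share`), the function names, the placeholder tickers, and all of the numbers are assumptions chosen purely for illustration.

```python
def normalize(values):
    """Scale a dict of non-negative numbers so they sum to 1."""
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}


def blended_weights(market_caps, revenues, mcw_share=0.5):
    """Blend market cap weight with economic weight.

    Economic weight here is on-chain revenue divided by market cap,
    normalized across tokens. The blend ratio `mcw_share` is a
    hypothetical parameter, not part of the proposal.
    """
    market_cap_weight = normalize(market_caps)
    revenue_ratio = {t: revenues[t] / market_caps[t] for t in market_caps}
    economic_weight = normalize(revenue_ratio)
    return {
        t: mcw_share * market_cap_weight[t] + (1 - mcw_share) * economic_weight[t]
        for t in market_caps
    }


# Placeholder tokens and figures (not real market data).
market_caps = {"TOKEN_A": 900.0, "TOKEN_B": 60.0, "TOKEN_C": 40.0}
revenues = {"TOKEN_A": 9.0, "TOKEN_B": 3.0, "TOKEN_C": 4.0}

weights = blended_weights(market_caps, revenues)
for token, weight in weights.items():
    print(f"{token}: {weight:.2%}")
```

In this toy example the largest token still gets the biggest weight, but a small token with high revenue relative to its market cap ends up ahead of a larger token with weaker fundamentals.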

The biggest challenge in creating a methodology for a DATA index at this time is the extreme power distribution of market capitalization and liquidity for the assets (LINK is 90% of circulating market cap and 94% of liquidity). DPI has this challenge as well, but not to the same extent.

Liquidity weight was added because DEX liquidity for GRT, NMR, and OCEAN was under $5m per token, so liquidity could become an issue if AUM exceeds $20m for DATA.


@verto0912 Thank you for sharing your feedback with us - these are excellent points for consideration!

@Kiba and I both feel that the Economic Activity Weight (EW) is a valuable part of the Matrix Scoring System because it gives greater weight to tokens with stronger economic fundamentals (i.e. higher on-chain revenues as a percentage of circulating market cap). We do not believe that adding economic activity is confusing as it is fully transparent in the Matrix Scoring System.

In regards to the data sources, the links provided are well established sites where each community tracks data relevant to their markets. We can improve data collection processes for the index over time as the Data Economy matures and develops as an industry.

As I mentioned in my comment to @jdcook, the biggest challenge in creating a methodology for a DATA index at this time is the extreme power distribution of market capitalization and liquidity for the assets (LINK is 90% of circulating market cap and 94% of liquidity). DPI has this challenge as well, but not to the same extent. Liquidity weight was added to the Matrix Scoring System because DEX liquidity for GRT, NMR, and OCEAN is under $5m per token, so liquidity could become an issue if AUM exceeds $20m for DATA.

We could simply calculate the square root of circulating market capitalization, but that would skew the index heavily into a single asset, Chainlink (LINK), and also not take into account fundamental economic activity for a given protocol, which we believe should be an important factor for the index.

For all of the tokens/projects in the Data Economy Index (DATA), data is the product.

For instance, Chainlink Node operators power decentralized price feeds, Graph Indexers create consumable GraphQL endpoints, data owners publish data assets for consumers on Ocean Protocol, and data scientists submit predictions from machine learning models to power Numerai’s hedge fund.

Starting with 4 tokens does feel a bit underwhelming, but we have to start somewhere! We’re extremely early in the development of the Data Economy. I believe that the Index Coop should define and own the category before someone else does. As noted in the post, DeFi Pulse started when Maker dominance of DeFi was ~90% and Lightning Network and Augur were still considered DeFi projects:

Crypto evolves rapidly. It is easy to imagine many additional projects meeting the Token Inclusion Criteria outlined above and being included in DATA. For example, if cross-chain interoperability were a solved problem on Ethereum and Set Protocol then projects like Filecoin and Helium would already be eligible for inclusion.


@verto0912 @jdcook In regards to your comments on Intrinsic Productivity (IP):

@Kiba has lots of ideas about Intrinsic Productivity (IP) for DATA, but we did not want to overload initial considerations for the index given that IP is still in development for DPI after much consideration and many spirited debates we have both been a part of. :grinning_face_with_smiling_eyes:

In the original draft, we had a section for Future Considerations. Here is what @Kiba wrote for IP:

Intrinsic Productivity (IP): IP will make DATA completely unforkable if done correctly. There are two types of IP possible: protocol-level staking and DeFi farming. We will start with DeFi strategies similar to DPI. Once staking becomes available on a protocol and we start staking on it, we can provide superior returns to token holders because 1) native staking offers higher yields and 2) we can give all returns to DATA holders, maxing out APY, which no other operator can do. This leads to a flywheel effect where higher yield gets more TVL → more staking power → more jobs (superlinear unlike L1/DeFi staking) → higher APY → higher TVL. If we reach capacity on native staking we can switch assets to DeFi farming.


Just wanted to respond to this @verto0912 . My perspective is that these assets are fundamentally different than the assets in DPI. They perform functions within a specific data economy where data is the overall product. The economic weights can also be seen as a sign of future growth, imo, so you aren’t just giving weight to what is big now, but also to what has strong tokenomics & usage and potential for high growth. I think it is a valuable addition and one that probably only makes sense with the DATA index at the moment.

The point about data collection is very fair. That would have to be transparent at all times.


Great work bringing this to the Forum @Thomas_Hepner and @Kiba - clearly a boatload of work has gone into this.

I strongly desire more index products from the Coop which help capture new investment themes in crypto. DPI and MVI do this well and I think DATA could too - though to be honest I would personally be more excited if the product had more tokens (8+) and a broader scope.

There’s been talk of this decentralized middleware index from DFP for some time with nothing forthcoming, and given that, I’d be open to your ‘data’ theme being opened up to cover all decentralized cloud (data, compute, query, graph, etc). I ultimately think decentralized cloud is a more comparable theme to DPI and MVI, with DATA being more comparable to decentralized exchanges or decentralized insurers, a category lower.

I’m sure DATA could be quite successful - and maybe we have subcategories captured by our index products in future - but for now, I think the Coop should prioritize a decentralized cloud index, which I think could be many multiples larger and help us build our brand better (regardless of who is the methodologist!).


Hey @DevOnDeFi - appreciate the thoughtful comments! Here is my thinking on some of your concerns:

@DevOnDeFi I hear your concern about wanting more tokens in the DATA index - that is definitely the long-term goal! I think it will happen naturally as the Data Economy grows and matures, just like DeFi did, but there are also some current technological limitations preventing this from happening today.

DATA’s scope already covers the decentralized cloud and much more.

The reason DATA does not include tokens in what you are calling the decentralized cloud (data, compute, query, graph, etc.) has nothing to do with the Token Inclusion Criteria of DATA. Decentralized data storage projects Filecoin ($5.4B), Siacoin ($0.85B), Arweave ($0.55B), and Akash Network ($0.2B) are all excluded because they are not ERC-20 tokens. This is a technological, not methodological, limitation. DATA would already include these projects if Ethereum and Set Protocol supported cross-chain interoperability.

If we wanted to include Filecoin in DATA at launch we would need to use wrapped Filecoin (WFIL) or renFIL. Siacoin, Arweave, and Akash Network do not have wrapped or derivative ERC-20 tokens at the present time.

@DevOnDeFi Do these points address your concerns or do you still have reservations? My main point is that we plan to include decentralized cloud projects in the DATA index when it is technologically feasible to do so.

@Thomas_Hepner and @Kiba, great work for collating so much information for this proposal.
As an engineer, I can see the big potential and demand for decentralized Web3 services because they make the whole ecosystem anti-fragile.

However, having four tokens to start the DATA index is quite risky for an investor, and degens will just buy them separately.

We are still in the early stage of Web3 and it is very hard to guess who will be the winners. I would also prefer to have 8+ tokens in an index for diversification.

I would recommend that we dig deeper into other Web3 ERC-20 tokens which have huge potential. Lastly, I think running a survey on Discord about which Web3 tokens people like will give us an idea of which tokens to dig into. I would also be delighted to help you on this journey.


Another point on the economic activity component - it limits your potential inclusions. Assume there’s a token that meets all of your criteria (which still looks like you are handpicking tokens) but they don’t have a credible source for revenue. What happens?

If there are only 4 tokens that fit your criteria, what it tells me is that either 1) your criteria are too strict or 2) the space is not mature enough for an index. Or potentially both.

I think trying to expand the universe (maybe you’ve done this and there are just no other tokens) could be a helpful exercise. What happens if you lower the market cap threshold to $50m from $100m? It would also reduce the dominance of LINK in a square root of market cap index.
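For anyone following along, here is a rough sketch of what square root of market cap weighting does to a dominant asset. The 90% share for the largest token matches the LINK figure quoted earlier in the thread; the split among the three smaller tokens and the placeholder names are assumptions for illustration only, not real market data.

```python
import math


def cap_weights(market_caps):
    """Plain market cap weighting: each token's share of total market cap."""
    total = sum(market_caps.values())
    return {token: cap / total for token, cap in market_caps.items()}


def sqrt_cap_weights(market_caps):
    """Square root weighting: shares of the summed square roots of market cap."""
    roots = {token: math.sqrt(cap) for token, cap in market_caps.items()}
    total = sum(roots.values())
    return {token: root / total for token, root in roots.items()}


# Hypothetical caps: one token holds 90% of the total, three share the rest.
caps = {"LARGE": 90.0, "SMALL_1": 4.0, "SMALL_2": 3.0, "SMALL_3": 3.0}

plain = cap_weights(caps)
rooted = sqrt_cap_weights(caps)

for token in caps:
    print(f"{token}: plain {plain[token]:.1%}, sqrt {rooted[token]:.1%}")
```

Under these assumed numbers the largest token drops from 90% to roughly 63%, so the square root compresses the dominance but does not remove it. Adding more small tokens by lowering the threshold would push that share down further.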

Really quickly on liquidity weights, if all tokens but LINK have liquidity constraints, having a liquidity weight gives LINK token a higher allocation at the expense of the other tokens which is something you are trying to avoid.


Really impressive analysis. I have not seen the Infrastructure/Middleware proposal by DeFi Pulse - is their proposal limited to Ethereum?



How do you think we are “handpicking tokens”? We have defined a set of objective criteria (the Token Inclusion Criteria) and instantiated a Token List with 4 tokens that meet all of those criteria.

Revenue and Earnings are both very commonly used for index construction. For example, Tesla was not added to the S&P 500 until December 2020 because it did not meet the S&P 500’s criteria of needing 4 straight quarters of GAAP profits.

Personally, I thought that was pretty stupid (I have been a major TSLA holder since mid 2019), and think Revenue is the right fundamental metric for an industry like the Data Economy experiencing rapid growth.

Yes, the economic activity component in the Token Inclusion Criteria absolutely limits inclusions by design. Augur and Gnosis were both excluded from the DATA Token List for having insufficient on-chain data-based economic activity. We view exclusion of tokens that do not fit the Token Inclusion Criteria as a feature, not a bug.

I disagree strongly with this point. As I noted in my comment to @DevOnDeFi:

DATA as designed has a higher market capitalization than all components in MVI; Filecoin alone has a higher market cap than all components in MVI and has been live for 6 years. DATA would already meet your goal to have 8+ tokens if Ethereum and Set Protocol supported cross-chain interoperability.

Lowering the market capitalization threshold would bring in interesting data-based projects like FOAM and Robonomics Network on that criterion alone, but these projects would still be excluded due to the economic activity criteria.

This is a good point. I don’t think we are trying to avoid LINK having an outsized weight in the index, given how much greater its market cap is than everything else, so much as give relatively more weight to projects with smaller market caps that nonetheless have great data economy fundamentals (NMR’s weight relative to OCEAN’s is the prime example of this phenomenon).

Thanks for all the responses and feedback everyone.
So far there seem to be 3 main topics: number of tokens, purpose of economic weight, and Intrinsic Productivity.

Number of Tokens

There are many tokens that qualify for inclusion in DATA but aren’t on the Ethereum chain. We listed at least 4 that we would like to include but can’t for this reason. It’s possible that wrapped tokens like WFIL could be used if they get enough on-chain liquidity. We did discuss reducing the mcap requirement to $30m, I think, but it didn’t make a big difference.


This is a very early stage for this sector, with lots of near-term potential in these tokens and for new tokens that come up as it develops. @verto0912 mentioned we use different data sources for economic activity of each and that’s because something like or doesn’t exist yet. This is obviously a tool we are looking to build and, as Thomas said, the Coop can play a major part in shaping this industry in its infancy, just like DPI and MVI.

Intrinsic Productivity

short summary on IP here
Even with a few tokens DATA has the same value as DPI with easy market coverage, rebalances, lower volatility, etc., so I don’t think we need IP to get PMF. That being said, I think the case for IP in DATA is stronger than in DPI, MVI and maybe SMI.

The first reason is that if we run our own nodes we can provide DATA holders with higher staking returns than any other nodes on the market because we don’t need to charge staking fees; we only take a small 95bps streaming fee. So historically ETFs outperform actively managed accounts and you get higher staking returns than anywhere else. Second, unlike DeFi staking’s pro-rata rewards, most DATA economies can have superlinear rewards for stakers. Nodes have access to even more jobs the more they have staked (bc of minimum requirement, bandwidth, etc.) and could theoretically monopolize the market and set prices at will, something not possible in DeFi. Third, with the amount of tokens we’ll ideally have in DATA we can go to nodes/pools and bargain for better reward splits.
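As a back-of-the-envelope check on the first point, here is a small sketch comparing a holder's net yield through self-run nodes against a third-party node that takes a commission. Only the 95bps streaming fee comes from the post. The fee mechanics (streaming fee charged on assets, commission charged on rewards), the function names, and the example yield and commission are my assumptions, so treat it as an illustration rather than a model of any specific protocol.

```python
STREAMING_FEE = 0.0095  # 95 bps per year, the figure quoted above


def net_yield_own_node(gross_yield, streaming_fee=STREAMING_FEE):
    """Holder's yield when the index runs its own node and only
    charges the streaming fee on assets (assumed mechanics)."""
    return gross_yield - streaming_fee


def net_yield_third_party(gross_yield, commission):
    """Holder's yield when a third-party node keeps a share of rewards."""
    return gross_yield * (1 - commission)


def breakeven_commission(gross_yield, streaming_fee=STREAMING_FEE):
    """Commission rate at which both routes pay the holder the same."""
    return streaming_fee / gross_yield


# Hypothetical inputs: 10% gross staking yield, 15% operator commission.
gross = 0.10
commission = 0.15

own = net_yield_own_node(gross)
third_party = net_yield_third_party(gross, commission)
print(f"own node: {own:.2%}, third party: {third_party:.2%}")
print(f"break-even commission: {breakeven_commission(gross):.1%}")
```

Under these assumptions the self-run route only wins when the operator commission exceeds the streaming fee divided by the gross yield, so the advantage grows with higher staking yields.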

This is all on top of the normal DeFi options available for tokens like LINK.

Economic Weight

I’d say jd captured the main reason here:

We are tracking economies, and economies are networks, so the strongest network will perform better over time, especially if its token price is undervalued relative to the fundamentals and a price increase then leads to higher economic bandwidth for it to operate. We use sources from the protocol itself or reputable community members and verify on-chain; an aggregator would be nice though.

It’s the Data Economy Index, and saying it’s weighted by economic activity has made sense to all the normie friends I’ve sent it to (like they buy DOGE and ADA) and to cryptonatives.


Thank you @patb! I do not think DeFi Pulse has put forward a proposal yet, but I’ve heard it mentioned on the weekly Index Coop community calls a few times. So, I do not know if their proposal is limited to Ethereum-based projects or not in the long-term.

I actually don’t think your response to @DevOnDeFi covers “the space is not mature enough for an index.” My reaction based on what I’ve seen so far is “a phenomenal idea that’s too early.”

Why would doing this now benefit the space or the Coop? DPI wasn’t the first tokenized DeFi index. It seems that didn’t matter too much. If you answer this question, you win my support.

The other notion would be to broaden the theme. I’m hesitant to go that route because I think you guys are onto something great with DATA. Have you considered a broader scope? If so, what would that be?


@fallow8 and I caught up over Discord about his question:

Here’s a brief summary of our discussion:

  • Establish Index Coop as the Institution that Tracks the Data Economy: The value of launching DATA now is that it establishes the Index Coop as the premier institution tracking the Data Economy. DPI inherited the credibility that DeFi Pulse built over 2 years by tracking DeFi from the beginning of its development. We believe it’s really important that the Index Coop position itself as the category owner of the Data Economy, even with only 4 tokens in the index, before a different index provider does.

  • Number of Tokens Will Naturally Expand: As blockchain technology matures and cross-chain interoperability solutions develop, the number of tokens in the index will naturally expand. There would be 8 tokens in the index now if Filecoin, Siacoin, Arweave, and Akash Network were ERC-20s.

  • Creation of a Public Scoreboard to Build Brand: People love scoreboards. There would be lots of brand and reputation value in the index methodologists (@kiba and I) maintaining a website with the Data Economy Token List, akin to what DeFi Pulse has done, showing which assets are included in DATA and which are not. For instance, there are many great DeFi assets, like Curve (CRV), which are included on DeFi Pulse, but not in DPI for various reasons.

@fallow8 Do you agree with this summarization of our conversation? Is there anything I missed or you feel I misrepresented?


Hello @Don-ETH - thank you for your thoughtful comment!

Here are some of my thoughts on different points you have made:

What do you mean by risk? If you are referring to volatility, serious cryptoasset investors are quite fine with volatility given that even the crypto blue chip assets like BTC and ETH dropped ~50% in the past month in USD-terms. DPI also declined >50% from peak to trough despite having 14 assets in the index.

DATA would not have fared any better on this dimension, regardless of whether it included 4 or 8 tokens. The original 4 tokens described in the post all lost 65%+ of their value from peak to trough; LINK declined almost 70%! Filecoin, Arweave, Siacoin, and Akash Network all declined by 70%+ in the recent drop so including them would not have decreased volatility of the DATA index materially.
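For reference, "peak to trough" here means the largest percentage decline from a running high. A minimal sketch of that calculation is below; the price series is made up for illustration and is not data for any of the tokens discussed.

```python
def max_drawdown(prices):
    """Largest peak-to-trough decline as a fraction of the running peak."""
    peak = prices[0]
    worst = 0.0
    for price in prices:
        peak = max(peak, price)  # highest price seen so far
        worst = max(worst, (peak - price) / peak)
    return worst


# Hypothetical price path: rises to 120, falls to 60, partially recovers.
prices = [100.0, 120.0, 90.0, 60.0, 80.0]
drawdown = max_drawdown(prices)
print(f"max drawdown: {drawdown:.0%}")
```

Running the same function over each constituent and over the weighted index would show whether adding more tokens changes the drawdown in any meaningful way.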

If you are referring to idiosyncratic or specific risk, then having 8+ tokens instead of 4 would certainly be better.

As I noted in some of my other replies, there would be 8 tokens in the index now if Filecoin, Siacoin, Arweave, and Akash Network were ERC-20s.


That covers it. For my part, our conversation turned me around on “it’s too early” as an objection. That may still affect the product in the prioritization process, but it’s not a reason not to do it. Especially because DG1 to DG2 can really refine a product. This should be on the list if you believe in the trajectory of the Data Economy as @Kiba and @Thomas_Hepner lay out (which I do). And it gives us the unique opportunities @Thomas_Hepner covers above (entrenching the Coop as a locus for signal effect in burgeoning markets).


I want to start this off by saying that this proposal sets the standard for future Index proposals. This is the exact level of quality and attention to detail our community should expect from future proposals.

From reading through the comments there seem to be two main concerns:

  1. Defining economic activity is difficult and the value of using it as an inclusion criterion is uncertain

I tend to agree here - for these broad thematic indices investors want extremely transparent methodologies that are also conceptually simple. Investors from all backgrounds value the simplicity of methods like market weighting because they are well established and battle tested in a variety of market conditions. The only real variable is the data input. Adding another layer of complexity to selection may also add another layer of unanticipated risk.

For broad thematic indices I tend to support this approach. With that said - if you guys strongly feel like this is necessary it definitely warrants further discussion.

  2. The current token basket feels limited

Spending some more time hashing out exactly what tokens we want to include would be very helpful. I see a strong argument for sidechains and potentially select L2 infrastructure being included in this index.

With that said - I am strongly in favor of moving this forward. I also see this as an opportunity to quickly release an index that has clear market fit and minimal engineering requirements. Our bias needs to be towards simple solutions and simple implementations. I would rather get v1 of these indices out the door than spend unnecessary time speccing out more elaborate architecture.


@BigSky7 Please see my responses to your comments!

Firstly, thank you - I am so flattered! :blush:

I could not agree more with this comment! This is the approach that @kiba and I believe will lead to success for the Data Economy Index (DATA) and the Index Cooperative.

I think this is a great summary of the outstanding issues and concerns with the methodology, @BigSky7!

I want to see more ideas and thoughts from the community, but we have definitely taken note of everyone’s thoughtful comments in the discussion and will carefully consider them together before drafting an IIP for a DG1 vote!

@Kiba and I are thrilled to work with the Index Coop community to build an index methodology that everyone is excited to launch!


@DevOnDeFi and I caught up over Discord about his concerns, primarily,


Here’s a brief summary of our discussion:

  • @DevOnDeFi : I buy a lot of the reasoning for keeping the DATA allocation as is IF:
    1. Middleware is NOT on the table because it’s agreed this is owned by DFP
    2. ERC20s only
    3. No wrapped tokens

Re point 1: I’d like to see DFP focusing on FLI stuff for a bit (great product market fit and revenue opportunity for them and Coop) if that enables you and Kiba to have a wider landscape to build your index in. Many people I have spoken to in the Coop over the last 3-4 months have desired a broader index product, covering data and middleware (and more). In my mind I’ve been thinking it is the ‘Decentralized Tech Stack Index’. With no forum post or lead from DFP it doesn’t seem right to reduce the landscape you get to build in.

  • On Wrapped Tokens: We used wrapped BTC elsewhere in Coop products, so I’d be totally cool with using wrapped tokens for a period of time (until Set/Coop has multichain tech infra)… So the four tokens you mentioned in one of your replies I’d be happy to see in the index wrapped.

  • On Thematic Nature of the Product and seeking to achieve product-market fit and good marketing attributes: My main thought here is:

    • What is THE catch-all term for this sector?
    • I don’t know what it is beyond maybe decentralized infra / cloud / stack
    • Could they all be called Web3? (Maybe not). Or Decentralized Tech Stack? I think this works best for me
    • Macro: I think the wider index product theme (more than just data) enables more growth and value add to the Coop at this stage IF it’s possible and there aren’t political or technical blockers.

@DevOnDeFi Do you agree with this summarization of our conversation? Did I miss or misrepresent anything?


I agree that this is a fair representation of our conversation - thank you for summarizing.

I hope you and the Coop get to build a narrower index at a minimum (even if a few tweaks/updates are required by the community), and hopefully even a wider themed one too. It’s great to see this post arrive, focus minds and generate discussion. There’s definitely some real opportunity in this area.

