Request for Discussion: Data Economy Index (DATA)


How do you think we are “handpicking tokens”? We have defined objective criteria (the Token Inclusion Criteria) and instantiated a Token List with the 4 tokens that meet all of those criteria.

Revenue and Earnings are both very commonly used for index construction. For example, Tesla was not added to the S&P 500 until December 2020 because it did not meet the S&P 500’s criteria of needing 4 straight quarters of GAAP profits.

Personally, I thought that was pretty stupid (I have been a major TSLA holder since mid 2019), and think Revenue is the right fundamental metric for an industry like the Data Economy experiencing rapid growth.

Yes, the economic activity component in the Token Inclusion Criteria absolutely limits inclusions by design. Augur and Gnosis were both excluded from the DATA Token List for having insufficient on-chain data-based economic activity. We view exclusion of tokens that do not fit the Token Inclusion Criteria as a feature, not a bug.

I disagree strongly with this point. As I noted in my comment to @DevOnDeFi:

DATA as designed has a higher market capitalization than all components in MVI; Filecoin alone has a higher market cap than all components in MVI and has been live for 6 years. DATA would already meet your goal to have 8+ tokens if Ethereum and Set Protocol supported cross-chain interoperability.

Lowering the market capitalization requirement would not change much: interesting data-based projects like FOAM and Robonomics Network would still be excluded due to the economic activity criteria.

This is a good point. I don’t think we are trying to avoid LINK having an outsized weight in the index, given how much greater its market cap is than everything else, so much as to give relatively more weight to projects with smaller market caps that nonetheless have great data economy fundamentals (NMR’s weight relative to OCEAN’s is the prime example of this phenomenon).

Thanks for all the responses and feedback everyone.
So far there seem to be 3 main topics: # of tokens, purpose of economic weight, and Intrinsic Productivity

# of Tokens

There are many tokens that qualify for inclusion in DATA but aren’t on the Ethereum chain. We listed at least 4 that we would like to include but can’t for this reason. It’s possible that wrapped tokens like WFIL could be used if they get enough on-chain liquidity. We did discuss reducing the mcap requirement to $30m, I think, but it didn’t make a big difference.


This is very early stages for this sector, with lots of near-term potential in these tokens and in new tokens that come up as it develops. @verto0912 mentioned we use different data sources for the economic activity of each, and that’s because something like or doesn’t exist yet. This is obviously a tool we are looking to build and, as Thomas said, the Coop can be a major part in shaping this industry in its infancy, just like DPI and MVI.

Intrinsic Productivity

short summary on IP here
Even with a few tokens, DATA has the same value as DPI with easy market coverage, rebalances, lower volatility, etc., so I don’t think we need IP to get PMF. That being said, I think the case for IP in DATA is stronger than in DPI, MVI and maybe SMI.

First, if we run our own nodes we can provide DATA holders with higher staking returns than any other nodes on the market, because we don’t need to charge staking fees; we only take a small 95bps streaming fee. So historically ETFs outperform actively managed accounts, and on top of that you get higher staking returns than anywhere else. Second, unlike DeFi staking with pro-rata rewards, most DATA economies can have superlinear rewards for stakers. Nodes have access to even more jobs the more they have staked (because of minimum requirements, bandwidth, etc.) and could theoretically monopolize the market and set prices at will, something not possible in DeFi. Third, with the amount of tokens we’ll ideally have in DATA, we can go to nodes/pools and bargain for better reward splits.

This is all on top of the normal DeFi options available for tokens like LINK.

Economic Weight

I’d say jd captured the main reason here:

We are tracking economies, and economies are networks, so the strongest network will perform better over time, especially if its token price is undervalued relative to the fundamentals and a price increase then leads to higher economic bandwidth for it to operate. We use sources from the protocol itself or reputable community members and verify on-chain; an aggregator would be nice though.

It’s the Data Economy Index; saying it’s weighted by economic activity has made sense to all the normie friends I’ve sent it to (like, they buy DOGE and ADA) and to cryptonatives.


Thank you @patb! I do not think DeFi Pulse has put forward a proposal yet, but I’ve heard it mentioned on the weekly Index Coop community calls a few times. So, I do not know if their proposal is limited to Ethereum-based projects or not in the long-term.

I actually don’t think your response to @DevOnDeFi covers “the space is not mature enough for an index.” My reaction based on what I’ve seen so far is “a phenomenal idea that’s too early.”

Why would doing this now benefit the space or the Coop? DPI wasn’t the first tokenized DeFi index. It seems that didn’t matter too much. If you answer this question, you win my support.

The other notion would be to broaden the theme. I’m hesitant to go that route because I think you guys are onto something great with DATA. Have you considered a broader scope? If so, what would that be?


@fallow8 and I caught up over Discord about his question:

Here’s a brief summary of our discussion:

  • Establish Index Coop as the Institution that Tracks the Data Economy: The value of launching DATA now is that it establishes the Index Coop as the premier institution tracking the Data Economy. DPI inherited the credibility that DeFi Pulse built over 2 years by tracking DeFi from the beginning of its development. We believe it’s really important that the Index Coop position itself as the category owner of the Data Economy, even with only 4 tokens in the index, before a different index provider does.

  • Number of Tokens Will Naturally Expand: As blockchain technology matures and cross-chain interoperability solutions develop, the number of tokens in the index will naturally expand. There would be 8 tokens in the index now if Filecoin, Siacoin, Arweave, and Akash Network were ERC-20s.

  • Creation of a Public Scoreboard to Build Brand: People love scoreboards. There would be lots of brand and reputation value in the index methodologists (@kiba and I) maintaining a website with the Data Economy Token List, akin to what DeFi Pulse has done, showing which assets are included in DATA and which are not. For instance, there are many great DeFi assets, like Curve (CRV), which are included on DeFi Pulse, but not in DPI for various reasons.

@fallow8 Do you agree with this summarization of our conversation? Is there anything I missed or you feel I misrepresented?

Hello @Don-ETH - thank you for your thoughtful comment!

Here are some of my thoughts on different points you have made:

What do you mean by risk? If you are referring to volatility, serious cryptoasset investors are quite fine with volatility given that even the crypto blue chip assets like BTC and ETH dropped ~50% in the past month in USD-terms. DPI also declined >50% from peak to trough despite having 14 assets in the index.

DATA would not have fared any better on this dimension, regardless of whether it included 4 or 8 tokens. The original 4 tokens described in the post all lost 65%+ of their value from peak to trough; LINK declined almost 70%! Filecoin, Arweave, Siacoin, and Akash Network all declined by 70%+ in the recent drop so including them would not have decreased volatility of the DATA index materially.
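
To make the peak-to-trough comparison above concrete, here is a minimal sketch of how such a drawdown could be computed from a price series. The function name `max_drawdown` and the sample prices are hypothetical, chosen only for illustration; they are not actual DPI, LINK, or DATA prices.

```python
def max_drawdown(prices):
    """Return the largest peak-to-trough decline as a fraction (0.5 == 50%).

    Assumes `prices` is a non-empty, chronologically ordered list of
    positive numbers.
    """
    if not prices:
        raise ValueError("prices must not be empty")
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be positive")

    peak = prices[0]
    worst = 0.0
    for price in prices:
        # Track the highest price seen so far.
        peak = max(peak, price)
        # Decline from that running peak to the current price.
        worst = max(worst, (peak - price) / peak)
    return worst


# Hypothetical series: rises to 120, falls to 60, partially recovers.
example_prices = [100, 120, 60, 90]
example_drawdown = max_drawdown(example_prices)
print(f"Max drawdown: {example_drawdown:.0%}")
```

Running the same function over each candidate token and over the index as a whole is one way to check whether adding more tokens would have reduced the drawdown materially.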

If you are referring to idiosyncratic or specific risk, then having 8+ tokens instead of 4 would certainly be better.

As I noted in some of my other replies, there would be 8 tokens in the index now if Filecoin, Siacoin, Arweave, and Akash Network were ERC-20s.

That covers it. For my part, our conversation turned me around on “it’s too early” as an objection. That may still affect the product in the prioritization process, but it’s not a reason not to do it. Especially because DG1 to DG2 can really refine a product. This should be on the list if you believe in the trajectory of the Data Economy as @Kiba and @Thomas_Hepner lay out (which I do). And it gives us the unique opportunities @Thomas_Hepner covers above (entrenching the Coop as a locus for signal effect in burgeoning markets).


I want to start this off by saying that this proposal sets the standard for future Index proposals. This is the exact level of quality and attention to detail our community should expect from future proposals.

From reading through the comments there seem to be two main concerns:

  1. Defining economic activity is difficult and the value of using it as an inclusion criterion is uncertain

I tend to agree here - for these broad thematic indices investors want extremely transparent methodologies that are also conceptually simple. Investors from all backgrounds value the simplicity of methods like market weighting because they are well established and battle tested in a variety of market conditions. The only real variable is the data input. Adding another layer of complexity to selection may also add another layer of unanticipated risk.

For broad thematic indices I tend to support this approach. With that said - if you guys strongly feel like this is necessary it definitely warrants further discussion.

  2. The current token basket feels limited

Spending some more time hashing out exactly what tokens we want to include would be very helpful. I see a strong argument for sidechains and potentially select L2 infrastructure being included in this index.

With that said - I am strongly in favor of moving this forward. I also see this as an opportunity to quickly release an index that has clear market fit and minimal engineering requirements. Our bias needs to be towards simple solutions and simple implementations. I would rather get v1 of these indices out the door than spend unnecessary time speccing out more elaborate architecture.


@BigSky7 Please see my responses to your comments!

Firstly, thank you - I am so flattered! :blush:

I could not agree more with this comment! This is the approach that @kiba and I believe will lead to success for the Data Economy Index (DATA) and the Index Cooperative.

I think this is a great summary of the outstanding issues and concerns with the methodology, @BigSky7!

I want to see more ideas and thoughts from the community, but we have definitely taken note of everyone’s thoughtful comments in the discussion and will carefully consider them together before drafting an IIP for a DG1 vote!

@Kiba and I are thrilled to work with the Index Coop community to build an index methodology that everyone is excited to launch!


@DevOnDeFi and I caught up over Discord about his concerns, primarily,


Here’s a brief summary of our discussion:

  • @DevOnDeFi : I buy a lot of the reasoning for keeping the DATA allocation as is IF:
    1. Middleware is NOT on the table because it’s agreed this is owned by DFP
    2. ERC20s only
    3. No wrapped tokens

Re point 1: I’d like to see DFP focusing on FLI stuff for a bit (great product market fit and revenue opportunity for them and Coop) if that enables you and Kiba to have a wider landscape to build your index in. Many people I have spoken to in the Coop over the last 3-4 months have desired a broader index product, covering data and middleware (and more). In my mind I’ve been thinking it is the ‘Decentralized Tech Stack Index’. With no forum post or lead from DFP it doesn’t seem right to reduce the landscape you get to build in.

  • On Wrapped Tokens: We used wrapped BTC elsewhere in Coop products, so I’d be totally cool with using wrapped tokens for a period of time (until Set/Coop has multichain tech infra)… So I’d be happy to see the four tokens you mentioned in one of your replies in the index, wrapped.

  • On Thematic Nature of the Product and seeking to achieve product-market fit and good marketing attributes: My main thought here is:

    • What is THE catch-all term for this sector?
    • I don’t know what it is beyond maybe decentralized infra / cloud / stack
    • Could they all be called Web3? (Maybe not). Or Decentralized Tech Stack? I think this works best for me
    • Macro: I think the wider index product theme (more than just data) enables more growth and value add to the Coop at this stage IF it’s possible and there aren’t political or technical blockers.

@DevOnDeFi Do you agree with this summarization of our conversation? Did I miss or misrepresent anything?

I agree that this is a fair representation of our conversation - thank you for summarizing.

I hope you and the Coop get to build a narrower index at a minimum (even if a few tweaks/updates are required by the community), and hopefully even a wider themed one too. It’s great to see this post arrive, focus minds and generate discussion. There’s definitely some real opportunity in this area.


I like the idea of a broader Decentralized Tech Stack idea.

Partly because that’s the broader motif I’d like to add to my own portfolio, but also it appears to solve one of the other primary concerns of a new product that’s limited to 4 tokens to start.

This may not be a perfect parallel but it seems as if we’re trying to build another DPI with this product. However, instead of capturing a broad spectrum of DeFi, we limited it to DEXes or Lending only before creating the broad category. I think we could start broadly, then narrow down if there’s demand.


I am in support of this proposal. Much like MVI, I believe there is a lot of value in what’s being built in the space, but I wouldn’t be comfortable trying to pick winners, so I would find an index quite attractive. My comments echo some that have already been raised, and my background is in risk so some views may not be mainstream, but I will run through them here:

  • Inclusion:
    I would like to see a very clean definition of what this index covers. I’m currently left wondering if ETH 2.0 staking providers may fall under this umbrella? Uniswap is getting into the oracle space - will they be considered? I think it may become important to have a clear definition of what this is, as, “capturing the growth of on-chain data economies” will likely most often be met with, “What’s an on-chain data economy?” (which is fine if we have an awesome answer; I currently don’t) rather than, “That’s awesome, where do I buy it?”

  • Weighting:
    I would be interested to see some analysis of various index weighting strategies wherein there is one dominant component in a small basket. I’m not opposed to the proposed strategy, I just don’t find it very easy to understand - and in turn - easy to explain to a potential investor. If it can be linked to more TradFi rebalancing metrics yet still minimize heavy weighting toward one protocol, I would be more comfortable with it.

    • It might be worth exploring a 2-layer weighting strategy (50% of the total basket weight may be attributable to 30 day average MC, 50% attributable to 30 day average volume relative to MC) . . .
    • Maybe it’s as simple as, "If there are 4 tokens, weighting will be 40-30-20-10% . . . if there are 5 tokens, weighting will be 35-25-25-10-5 . . . if there are 6 tokens . . . ".
    • Maybe it’s by MC, but capped at double its equal-weighting percentage - 4 tokens at 25% each would cap a single token at 50% even if its MC would put it at 90% . . . the remaining tokens split the other 50% by MC.

    I’m spitballing here, but I think if it’s kept simple it will be more likely that someone will invest.

  • Availability:
    I believe maybe the most challenging aspect of this index (for now) would be ERC-20 tokenized availability. There will eventually be a lot of players in this space, but I’m concerned that at this time there are only a few viable candidates ‘On Ethereum’. I expect that there will likely be synthetics for most non-ethereum-native providers. Furthermore, having a token is fine, but I would want to be sure that the token is integral to the protocol (e.g. - are there solid tokenomics - a real limitation to some staking providers and L2s - or even say Storj which doesn’t require that users use the token to pay for the service). I think this specific challenge will diminish greatly over time.

I don’t see any of these things as blockers, but clarity around the topics without unnecessarily limiting the ability to govern this specific index down the road would be helpful, not just for our own thinking, but with an eye toward marketing and large-scale adoption. This is a solid proposal @Thomas_Hepner and @Kiba - it’s well thought out, well researched, and I’m genuinely looking forward to this discussion developing further.
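
As a rough illustration of the capped market-cap idea from the Weighting bullet above, here is a minimal sketch. It assumes the cap is a multiple of the equal-weight share (2x by default, so 50% in a 4-token basket) and that any excess weight is redistributed to the uncapped tokens in proportion to their market caps. The function name `capped_weights`, the parameter `cap_multiple`, and the token names and market caps are all hypothetical; this is one possible reading of the suggestion, not the proposal’s methodology.

```python
def capped_weights(market_caps, cap_multiple=2.0):
    """Market-cap weights with each token capped at cap_multiple / n.

    Assumes `market_caps` maps token name -> positive market cap and
    cap_multiple >= 1 (otherwise the weights could not sum to 100%).
    """
    if not market_caps:
        raise ValueError("market_caps must not be empty")
    if cap_multiple < 1:
        raise ValueError("cap_multiple must be at least 1")
    if any(mc <= 0 for mc in market_caps.values()):
        raise ValueError("market caps must be positive")

    cap = min(1.0, cap_multiple / len(market_caps))
    weights = {}
    remaining = dict(market_caps)  # tokens not yet fixed at the cap
    budget = 1.0                   # weight still to be distributed

    while remaining:
        total = sum(remaining.values())
        # Tokens whose proportional share of the budget would exceed the cap.
        over = [t for t, mc in remaining.items()
                if budget * mc / total > cap + 1e-12]
        if not over:
            # Nobody exceeds the cap: split the rest by market cap.
            for token, mc in remaining.items():
                weights[token] = budget * mc / total
            break
        for token in over:
            weights[token] = cap
            budget -= cap
            del remaining[token]
    return weights


# Hypothetical basket where one token would be 90% by pure market cap.
example_caps = {"A": 90.0, "B": 5.0, "C": 3.0, "D": 2.0}
example_weights = capped_weights(example_caps)
for token, weight in example_weights.items():
    print(f"{token}: {weight:.0%}")
```

In this hypothetical basket the dominant token is held at 50% and the other three split the remaining 50% by market cap, which matches the example given in the bullet.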


Thanks for the response mel!

Inclusion: Technically you could say ETH, L2s and stakers could be in DATA Index since they are a type of cloud computing. We want to have a token diversified outside of Ethereum so we are focusing on chain-agnostic technologies/tokens which would disqualify something like LIDO or MATIC.

Weighting: Thomas and I are reviewing community feedback on this and will provide an update soon.

Availability: Ideally wrapped and/or synthetic options build up soon. FIL has both versions but not a lot of liquidity for either. In another (hidden?) tab in the matrix score sheet we categorize tokenomics and how that plays into economic weights.

Excellent, I support this proposal, this is a product I want in my portfolio. I am particularly excited by the IP potential, a lot of interesting things to do here.

Just a concern regarding the weight of Numerai. 24% of the index seems very high for a project whose economic bandwidth is delivered by a single centralized entity (a hedge fund with a low level of transparency). Or maybe this has changed? I have not looked at the project for a while. If it’s still the case, I think the EW should take this into account in some way.


Thank you for commenting, @trx314 - Very much appreciate your support for the Data Economy Index!

@Kiba and I are researching modifications to both the Token Inclusion Criteria and Token Weight formula that address the community’s concerns around (1) not enough tokens in the index and (2) a complicated index weight calculation.

We have a draft proposal that is not quite ready for the forums that we think will address the concerns around overconcentration in Numeraire you brought to our attention :grinning:


Hi Thomas,

A little bit late to the party here, but here are some thoughts.

1. Weighting

I have less of a problem with the complexity of the weighting. I would imagine that many of the investors in DATA would be more technical (we tend to buy what we know) and so would appreciate a more nuanced methodology.

Furthermore, I see part of DATA’s USP being made on the weighting and rebalancing metrics allowing you to identify and allocate towards winners early. Therefore your index provides value because it identifies good picks based on the key metrics you have identified.

2. Number of tokens included
I agree with @DevOnDeFi and others that ideally the number of tokens would be greater. But as you have stated, this index should continue to iterate, grow and evolve. I would like to see DATA’s launch as a major driver of a medium- to long-term push for inclusion of non-ERC20 tokens in our products. Cross-ecosystem products would be a major differentiator and competitive advantage, which we should be striving to unlock. DATA helps us do this.

3. Engineering
I think we can all agree this is the single biggest issue facing the Coop at the moment - and it must be tough to deal with this challenge as a prospective methodologist given that DATA is not particularly demanding on engineering resource.

Maybe you could explore some workable solutions to mitigate this challenge and come back to the Coop with some ideas that would help address the dev resource constraints. It would also set a useful precedent for methodologists in the future on how to overcome this same engineering bottleneck.

Hello @Pepperoni_Joe - thank you for the thoughtful reply!

Here are my thoughts on each of your comments:

1. Weighting:

While I do agree with this, and like this product more as a cryptoasset investor, I think I’ve come to the view that the purpose of a sector index like DPI, MVI, and hopefully DATA is to define and provide broad exposure to a defined category. After we’ve launched DATA as a sector index, I think it could make sense to explore DATA-based indices with more complex methodology. For instance, you could imagine a vanilla DATA index with weights solely determined by market capitalization as well as a smart beta DATA index that would give more weight based on economic activity factors like growth, TVL, revenue, profit, etc.
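
The vanilla versus smart beta distinction above can be sketched in a few lines. This is a minimal illustration, assuming the smart beta tilt is a simple linear blend of market cap share and the share of a single economic activity measure such as revenue. The blend itself, the function names, the parameter `activity_share`, and all figures are hypothetical and do not come from the proposal.

```python
def normalize(values):
    """Scale a dict of positive numbers so the values sum to 1."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("values must sum to a positive number")
    return {key: value / total for key, value in values.items()}


def blended_weights(market_caps, activity, activity_share=0.0):
    """Blend market cap weights with economic activity weights.

    activity_share=0.0 gives a vanilla market cap weighted index;
    higher values tilt the index toward economic activity.
    Assumes both dicts cover the same tokens.
    """
    if set(market_caps) != set(activity):
        raise ValueError("market_caps and activity must cover the same tokens")
    if not 0.0 <= activity_share <= 1.0:
        raise ValueError("activity_share must be between 0 and 1")

    cap_w = normalize(market_caps)
    act_w = normalize(activity)
    return {
        token: (1.0 - activity_share) * cap_w[token]
        + activity_share * act_w[token]
        for token in market_caps
    }


# Hypothetical two-token basket: BIG has the larger market cap,
# SMALL has the larger economic activity (e.g. revenue).
caps = {"BIG": 60.0, "SMALL": 40.0}
revenue = {"BIG": 20.0, "SMALL": 80.0}

vanilla = blended_weights(caps, revenue)                      # market cap only
smart_beta = blended_weights(caps, revenue, activity_share=0.5)
print(vanilla)
print(smart_beta)
```

With these made-up inputs the vanilla index is 60/40 in favor of BIG, while the 50/50 blend shifts it to 40/60 in favor of SMALL, which is the kind of tilt toward smaller but more economically active projects discussed earlier in the thread.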

2. Number of tokens included:

@Kiba and I have done some more research based on feedback and I think we’ll be able to come back with a new proposal that will have substantially more tokens (hopefully 8+ tokens will be in the next proposal).

Yes, this is definitely my view. In the long-run, I believe Set Protocol and Index Cooperative will need to become multi-chain to achieve their full potential.

3. Engineering:

:100: Could not agree more with this comment! I’ll be exploring how we can unblock simple index portfolios and get DATA launched as well as many more products from other methodologists.


Please see the latest updates to the Data Economy Index proposal here:

@Thomas_Hepner feel free to reach out if u r in need for some engineering input. Would be happy to help remove any open questions whether surrounding smart contract or u simply need to bounce some ?s off me and talk thru some possible blockers