The world of data is in complete effervescence. While the cloud computing market is set to reach an estimated $275 billion this year, Analytics, AI, Machine Learning and Data technology companies seem to have handsomely benefited from the $30B venture capitalists have invested in this space in the past couple of years.
In a world where new startups seem to emerge weekly, CIOs are challenged to make sense of the data market, let alone decide which technology paradigm to bet on as they build their “modern data stack.”
Luckily, there is Matt Turck. A “Poly Sci” major by training, Matt immigrated to the U.S. from France some 20 years ago and sold a company he had co-founded to Oracle before making his way into the venture capital world. He’s currently Managing Director at FirstMark Capital, an early-stage VC firm based in New York City. Matt is passionate about data. So, in 2012, he decided, with his colleague Shivon Zilis, to publish a chart of the big data ecosystem. The graphic was an attempt to group companies by category to help colleagues wrap their heads around the fast-growing ecosystem. They decided the community could benefit from it too, so they published it on their blog. And voilà, the “Big Data Landscape” was born.
Fast-forward to last week: Matt and team publish the 2021 Machine Learning, AI and Data Landscape. The number of companies covered has vastly increased. And the technology breadth of the map has expanded from “just Data” to “ML, AI and Data” or MAD. Trust me, you will be GLAD (!) to read Matt’s analysis of the industry’s latest developments.
I had the opportunity to talk to him this week, and here is what we discussed that CIOs and Data Leaders should consider when investing in this space.
Filters & Translators agree
You can (and should) consider using industry analysts’ tools to evaluate vendors and assess trends. One of my favorites is Gartner’s Hype Cycle because it filters the noise and translates the relevance of specific technologies as they evolve through their maturity phases (if you’re curious about how Hype Cycles work, watch my quick explainer here).
Whichever tool you decide to use to filter information or translate the trends, one thing is for sure: the data foundation has now been laid. Long gone are the days of unscalable on-premises databases or difficult-to-manage Hadoop clusters. The market has now mostly built the infrastructure required to help any company of any size store data in the cloud securely.
What does this mean?! If you felt the pace of innovation in this field was fast, buckle up. It’s about to get into high gear. This checks out with data points we’ve discussed in the past; recently, Gartner observed that 86% of the total database management system market revenue growth in 2020 was cloud-based, and predicted that by 2023, 75% of all databases will be on a cloud platform.
Bottom-line: now that the foundational aspects of reliable data platforms are available for applications to take advantage of, one can only guess that innovations on top of that platform are about to proliferate (either built by internal data teams using intelligent frameworks or bought from innovative startups).
From “VS.” to “AND” to “WITH”
Matt notes in his piece that convergence is happening and “it’s healthy.” “It is time for the data industry to evolve beyond its big technology divides: transactional vs analytical, batch vs real time, BI vs AI.” I agree. I’ll even go a step further actually.
Earlier this year, I wrote about how CIOs needed to be wary of the “VendorSpeak”—the non-malicious bias technologists often fall prey to when trying to apply the solution they have built to more and more contiguous use-cases.
What CIOs ought to focus on are the use-cases their teams are on point to build and how their work is directly correlated to business value and differentiation for their company. Trying to make a data lake look, smell and work like a data warehouse could be restrictive and might lead to disappointments.
The best methodology I’ve seen CIOs succeed with is what’s called “Center of Design.” For every vendor that comes through your doors, assess their offer by zeroing in on what “each product is best applied for.” The best way to assess a solution’s Center of Design is to align it to the “Big 3”: users, data and processes.
Take the example of a data warehouse vs. a data lake. What data is the solution best for? Structured or Unstructured Data? What user persona is it optimized for? Analysts or Data Scientists? What type of use case is it best suited for? Analysis or Modeling and Exploration?
Will you need a data platform that enables “transactional WITH analytical,” “batch WITH real time” and “BI WITH AI”? You bet. But if the vendor tells you that they have one tool that can “solve it all,” watch out. Step back and ask: does your need fit in an area of overlap, or will the proposed “one-size-fits-all” approach stretch the solution beyond its true center of design?
I highly suggest you block 30 mins on your calendar this coming week to read the 2021 Machine Learning, AI and Data (MAD) Landscape. The piece is well researched and comes with a lot of additional resources (my favorites are Matt’s “S-1 Teardown” blogs where he dissects the SEC filings of companies planning on going public).
When I asked Matt how CIOs should use his research to select the right companies, we agreed on at least 2 practices.
1 – Think like a venture capitalist
When selecting technology from a new vendor, don’t narrow your view to the founder’s narrative on venture rounds and astronomical valuations. Assess the company on the fundamentals. Yes, the company might have raised a large amount of capital over the last few quarters, and from prestigious venture capital firms as well. That’s great.
But what has been the employee growth and churn in the same time period? What has been the customer count growth and churn? Benchmark the company using some of the recent research published by firms like Bessemer: at $100M ARR, a company’s expenses tend to average 35% for Research & Development, 50% for Sales & Marketing, and 20% for General & Administrative (G&A).
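For readers who like to put numbers to this, here is a minimal Python sketch of that benchmark comparison. It assumes the three percentages are shares of revenue (my reading, since they add up to more than 100%), and the vendor figures and the `expense_gaps` helper are hypothetical, made up purely for illustration.

```python
# Benchmark shares quoted above for a company at $100M ARR.
# Assumption: each percentage is a share of revenue.
BENCHMARK = {"R&D": 0.35, "S&M": 0.50, "G&A": 0.20}


def expense_gaps(revenue, expenses):
    """Return each category's share of revenue minus the benchmark share."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return {
        category: round(expenses[category] / revenue - BENCHMARK[category], 4)
        for category in BENCHMARK
    }


# Hypothetical vendor: $100M revenue, expenses in $M (illustrative only).
vendor_expenses = {"R&D": 30.0, "S&M": 65.0, "G&A": 18.0}
gaps = expense_gaps(100.0, vendor_expenses)

for category, gap in gaps.items():
    # A positive gap means the vendor spends more than the benchmark.
    print(f"{category}: {gap:+.0%} vs. benchmark")
```

A large positive gap in Sales & Marketing, for example, is not disqualifying on its own, but it is a good prompt for a follow-up question to the vendor.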
Remember that valuations are at an all-time high right now: between 2020 and 2021, average multiples have increased to 20X, and top private cloud companies are receiving an even higher 34X. As many as 90% of startups fail, so you might also want to use tools like LinkedIn Top Startups 2021 to gather more information.
2 – Think like an engineer
Some of the most popular technology fields are riddled with unfulfilled promises. Take Artificial Intelligence: IDC predicts that AI spending will reach $342B in 2021 and accelerate in 2022 to break the $500 billion mark by 2024. 95% of tech leaders consider AI to be important in their digital transformation efforts.
When evaluating new technology, ask to see similar customers’ metrics that relate to “operationalization.” For instance, “what is your customers’ average time from pilot to production…and back” (agility to roll out and back might be very relevant to your use-case).
Finally, don’t just stop at a restricted pilot. Commit to driving your test pilot into production, even if it requires additional service costs. What matters is not just a successful pilot. Production speed and agility metrics should be your focus. This practice will also allow you to test your team’s ability to ingest and operationalize a technology and its ecosystem.