We hear a lot about democratization—the action of making something accessible to everyone—in public policy. But democratization also applies to data, and it’s changing the way we drive successful client outcomes in financial services.
Data Drives Decision-Making
As head of business and technology solutions at William Blair Investment Management, it’s my job to integrate technology into our operations—including investment processes, of course, but also marketing and distribution.
While my team focuses on technological solutions—spanning data engineering, data governance, technology development, data visualization, and analytics—technology would serve no purpose without data. Data underpins every decision we make.
Given that we manage equity and debt portfolios, much of this data is financial in nature. Historically, it has largely been quantitative data specific to companies and how they operate—for example, balance-sheet data, income-statement data, and cash-flow data.
Some of our data is proprietary, spanning custom methodologies, calculations, and portfolio-specific data. But we also work with external data, coming from large providers such as Refinitiv, Bloomberg, and FactSet.
We’re also increasingly exploring what we now call alternative data. A few years ago, alternative data consisted of obscure things like satellite imagery of a retail company’s parking lot, but now it includes, for example, key performance indicators and environmental, social, and governance (ESG) data.
More Data, New Challenges
There’s so much more of this data today than there was in the past. The total amount of data created, captured, copied, and consumed globally is forecast to grow rapidly, reaching an estimated 181 zettabytes in 2025.
Raw data also requires cleaning and preparation to be useful. For use in investment research, the data must be stitched together and mapped to a specific company or security, a task that requires data engineering.
It’s also important to process and store the data in an agnostic format to minimize copying and maximize the number of access mediums able to read the data. And alternative data, such as news and sentiment data, can be subjective and more difficult to standardize and store.
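As a minimal sketch of that stitching-and-storage step, the snippet below joins a hypothetical vendor feed to a hypothetical security master and writes the result to Parquet; the identifiers, column names, and figures are illustrative, not our actual schema.

```python
import pandas as pd

# Hypothetical raw vendor fundamentals, keyed by the vendor's own identifier.
vendor = pd.DataFrame({
    "vendor_id": ["V001", "V002", "V002"],
    "period":    ["2024Q1", "2024Q1", "2024Q2"],
    "revenue":   [120.0, 85.0, 90.0],
})

# Hypothetical internal security master mapping vendor IDs to a canonical ID.
master = pd.DataFrame({
    "vendor_id":   ["V001", "V002"],
    "security_id": ["SEC-1", "SEC-2"],
})

# Stitch the datasets together so every record maps to a specific security.
mapped = vendor.merge(master, on="vendor_id", how="inner", validate="many_to_one")

# Store in an agnostic, columnar format (Parquet) so Python, R, and BI tools
# can all read the same file without making further copies.
mapped.to_parquet("fundamentals.parquet", index=False)
```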
Last, it’s critical to make this prepared data reusable. There may be multiple use cases for a given dataset, and we need to ensure the data is readily available for all of them so we don’t repeat the same work for every task or project.
Democratizing Data Is Key
In trying to solve these challenges, we’re sharing data and expanding access mediums. This is the concept of data democratization: give many end-users access to data while minimizing the barriers between the user and the data. End-users such as research analysts can then take the data, embed it in their financial models or business intelligence (BI) tools such as Power BI and Tableau, and do further analytics without needing a technologist or developer. Developers, meanwhile, can access the data in multiple environments, whether through a REST API, Python, or R. Minimizing those barriers is what makes data democratization scalable.
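As a minimal sketch of that self-service access from Python, assuming a hypothetical internal REST endpoint (the URL and field names below are placeholders, not a real API):

```python
import pandas as pd
import requests

# Query the (hypothetical) internal data service directly; no developer needed.
resp = requests.get(
    "https://data.example.internal/api/v1/fundamentals",
    params={"tickers": "AAPL,MSFT", "fields": "fwd_pe,revenue"},
    timeout=30,
)
resp.raise_for_status()

# The same records an analyst would pull into Power BI or Tableau land here
# as a DataFrame, ready for further analytics.
df = pd.DataFrame(resp.json()["data"])
print(df.head())
```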
In the past, our colleagues have been able to do this to some extent, but it hasn’t always been easy, and it has been time-consuming. Advancements in technology will make data democratization easier going forward.
Creating a Data Lake
One step we’ve taken toward data democratization is embracing and adopting new cloud-based technology: we now have a production-ready data lake that enables next-generation data access.
A data lake is a repository of data stored in its natural or raw format. Data flows in from various sources, through our data lake, to a consumption zone, where it is made available for easy access. This data flow process is standardized and repeatable, and it reduces the need for making copies of data. It also enables better management of data quality.
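A minimal sketch of what that standardized, repeatable flow might look like; the function, file names, and feed are hypothetical simplifications of a real pipeline:

```python
import pandas as pd

def publish(raw: pd.DataFrame, dataset: str) -> None:
    """Standardized, repeatable flow: raw data in, curated consumption zone out."""
    # Basic data-quality management: drop exact duplicates and records
    # missing the key that everything is mapped to.
    clean = raw.drop_duplicates().dropna(subset=["security_id"])

    # One curated copy serves every downstream reader, reducing data copies.
    clean.to_parquet(f"{dataset}.parquet", index=False)

# A hypothetical raw feed, as it might land in the lake in its natural format.
feed = pd.DataFrame({
    "security_id": ["SEC-1", "SEC-1", None, "SEC-2"],
    "metric":      [1.2, 1.2, 3.4, 0.9],
})
publish(feed, "fundamentals")
```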
Data lakes can not only accommodate larger amounts of data but can do so at a lower cost relative to data warehouse alternatives. By separating your data storage from your data processing technology, while still keeping them in close proximity, you can realize further gains in scalable data democratization.
Working With Existing and New Data
How does data democratization work with existing, curated data? One example comes from our quantitative research team, which has developed a library of quantitative factors and model data used in research across teams. This library was previously available only in closed or semi-open systems with proprietary data formats. Now it is more easily shared, and other teams can readily access it and incorporate it directly into their analysis.
Data democratization also enables faster turnaround in processing and using new data. For example, we’re working with a number of new datasets, including data on greenhouse gas emissions at the company and portfolio levels. Specifically, we are providing access to company emissions metrics so that analysts can incorporate them into their fundamental analysis. These include induced scope 1, 2, and 3 emissions (and avoided emissions when products and services enable efficiency or decarbonization). We’re also beginning to use the data to understand indicators at the portfolio level so that portfolio managers can measure risks and opportunities related to the energy transition. All of this requires us to share data for easy access and analysis, and we’re able to do that thanks to our data democratization efforts.
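As a minimal sketch of one such portfolio-level indicator, the snippet below computes a weighted-average carbon intensity from company-level scope 1, 2, and 3 figures; the numbers and column names are toy values, not our actual methodology:

```python
import pandas as pd

# Hypothetical holdings with emissions intensities (tCO2e per $M revenue).
holdings = pd.DataFrame({
    "security_id": ["SEC-1", "SEC-2", "SEC-3"],
    "weight":      [0.50, 0.30, 0.20],   # portfolio weights, summing to 1
    "scope1":      [120.0, 40.0, 10.0],
    "scope2":      [30.0, 25.0, 5.0],
    "scope3":      [400.0, 150.0, 60.0],
})

# Weight each company's total intensity by its share of the portfolio,
# then sum across holdings to get a portfolio-level figure.
holdings["intensity"] = holdings[["scope1", "scope2", "scope3"]].sum(axis=1)
waci = (holdings["weight"] * holdings["intensity"]).sum()
print(f"Portfolio carbon intensity: {waci:.1f} tCO2e/$M revenue")
```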
Improving Existing Processes
Data democratization can also improve existing processes. As an example, one of our analysts needed to compare quarterly and annual data reported by numerous companies. Because those companies have different fiscal years and reporting cycles, it’s hard to compare apples to apples without manipulating the data. When doing that in Excel became too time-consuming, the analyst asked my team for help. We got her the raw data and thought the project was finished, until she mentioned she didn’t have the deep coding skills needed to continue manipulating the data. Data democratization can involve many steps; it’s not a “set-it-and-forget-it” improvement.
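One of those steps is calendarizing the data, that is, mapping each company’s fiscal quarters onto common calendar quarters. A minimal sketch, with hypothetical companies and figures:

```python
import pandas as pd

# Hypothetical quarterly reports from companies with different fiscal year-ends.
reports = pd.DataFrame({
    "company":    ["X", "X", "Y", "Y"],
    "period_end": ["2024-03-31", "2024-06-30", "2024-01-31", "2024-04-30"],
    "revenue":    [100.0, 110.0, 80.0, 85.0],
})
reports["period_end"] = pd.to_datetime(reports["period_end"])

# Map each fiscal quarter to the calendar quarter it ends in, so companies
# on different reporting cycles line up apples to apples.
reports["calendar_q"] = reports["period_end"].dt.to_period("Q")
aligned = reports.pivot(index="calendar_q", columns="company", values="revenue")
print(aligned)
```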
Fit for the Future
Advances in technology now make it easier to shift time spent accessing data to analyzing it, and our efforts to democratize data can benefit our colleagues (and ultimately our clients) in many ways.
First, our colleagues can access and analyze data directly via a self-service model. Recently, for example, someone in sales received a question from a prospective client whose investment committee had calculated a much lower forward price-to-earnings (P/E) ratio for an index than we had. We pulled the constituent data, broke down how we calculated the P/E, and then explained the ways one could calculate it to get a number closer to the prospective client’s figure, all without going to a developer.
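The gap between two such figures often comes down to methodology. A minimal sketch, with toy numbers, of two common ways to compute an index-level forward P/E:

```python
import pandas as pd

# Two hypothetical index constituents.
idx = pd.DataFrame({
    "weight":  [0.6, 0.4],      # index weights, summing to 1
    "price":   [100.0, 50.0],
    "fwd_eps": [5.0, 1.0],      # forward earnings per share
})
idx["pe"] = idx["price"] / idx["fwd_eps"]

# Method 1: weighted average of constituent P/Es (skewed by high-P/E names).
wavg_pe = (idx["weight"] * idx["pe"]).sum()

# Method 2: weighted harmonic mean, i.e., aggregate price over aggregate
# forward earnings; typically lower, and often what index providers report.
harmonic_pe = 1.0 / (idx["weight"] / idx["pe"]).sum()

print(f"Weighted average P/E: {wavg_pe:.1f}")    # ~32.0
print(f"Harmonic mean P/E: {harmonic_pe:.1f}")   # ~26.3
```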
Second, our colleagues spend less time accessing and stitching together data, which improves their workflows. Time is shifted away from collecting information to analysis and making decisions.
Third, efficiency in technology and process workflows reduces costs, which benefits our clients.
The investment in technology to facilitate scalable data democratization will enable asset managers to do more with fewer resources than would otherwise be needed. This is a new and evolving area, and we are excited to see where this takes the asset management industry. We want to push the computers to the limit of what they can do, and then step in as humans only to do what computers can’t. And I think we’re making great progress.
Stay tuned for a future blog post about key lessons we’ve learned from our data democratization efforts.
Kristina Blaschek is director of business and technology solutions for William Blair Investment Management.