Why Cloudera interrogates diversity data for the greater good

 April 07, 2021

Cloudera understands that its own data around diversity, inclusion and equality is key. The company is on a journey to ensure it has the data it needs to hold itself accountable for the DE&I strategies and programs that it puts in place.

Interrogating data for the greater good

In January 2021, data scientist Solomon Makoni joined the company to drive its DE&I focus. With experience across all aspects of product development and leadership in AI, data science, and data engineering, Solomon works in data because, for him, data matters: “Whether you opt in or not, decisions are made every day using data that affect each and every one of us,” he says.

He explains that the only way to make sure that the decisions made are fair is to participate in the collection, curation, and analysis of that data. 

“People like to say that data does not lie; that’s not strictly true. Data is complicated and it can be used to drive progress or inhibit it,” he continues.

Solomon argues that over the years data, including diversity, equity and inclusion data, has been used to reinforce a system that “only benefits a few.”

“If we’re going to be guided by data and statistics then I strongly believe that we need society-focused individuals who are going to interrogate the data for the greater good,” he says.

“Data reflects the biases in society. We have to account for that bias in the data we’re working with, otherwise we’re allowing the lie to self-perpetuate,” Solomon adds. 

The problem of believing data does not lie


When it comes to equality, diversity and inclusion, Solomon details two key areas of oversight for companies that collect and rely upon data. Firstly, companies that start from the vantage point that data does not lie or that they have all the answers - “a perspective that is more prevalent than you might think,” adds Solomon - are missing the challenges people face.

For Solomon, this oversight has serious implications. As a society, we are missing out on our true potential by not tapping the wisdom and insights of those that do not get the chance to participate. “This is the data we need, the data of those on the fringes of our society. We need to know what their thoughts are, their ideals and what they would do differently if they could lead,” he adds.

“We are lucky that we’re living in a point in time where progress, small though it might be, is being made,” continues Solomon. But he adds that a lot of the data businesses are using is historic and not representative. It is based on “who got educated, who got the job, not who could have gotten the education or the job.” The data therefore does not represent many women and underrepresented minorities (URMs).

“Historical data represents a snapshot in time. We have to instead look at the entire pipeline, from school education, through to who gets into universities and scholarships/ converted tech programs etc.,” Solomon explains.

“When I think back to when I was a kid in grade school, the top five students were without question girls. Yet at some point, boys become prioritized by the system. This is the case with gender inequalities in our systems.”

Solomon insists that we need more data on how individuals move through and drop off from the pipeline and the bias they encounter along the way so that people can create more spaces for women and talk meaningfully about equality.

Taking responsibility for our own biases

Solomon moves on to the second key point: that we all must accept our role, because as humans we all have unconscious biases.

While the default reaction to this suggestion is too often defensive, ‘I’m not biased’, Solomon asks: “What if we accept that we are all victims of a system that was not designed for people to progress together, but instead creates a bottleneck that only lets a few through at a time? And the people who go through first are those that designed it?”

“We need to collect and analyze data on our own biases, normalize the fact that it’s natural to be biased, so that we can work to fix these biases by opening our circles to new people, new cultures and new ways of thinking. Otherwise known as being inclusive,” he says.

“I think we can all agree that the current system no longer works for society at large because it's not creating any new ideas or opportunities.” Instead, it is stagnating society by “repeddling archaic ideals that no longer resonate.”

Solomon states that if we can all agree that this system does not benefit any of us and that, each in our own way, we are perpetuating it, then we open the door to helping each other.

If we adopt this approach, we embed a growth mindset for ourselves and others: “How can I as a data scientist help you pursue your career and how might you be able to connect me with a business mentor to give me leadership insights for the promotion I am pursuing?”

Understanding the intersectionality between race and gender

A keen interest of Solomon’s is the intersectionality between race, gender and gender identity, overlaid with access to opportunities, privilege, and power. He believes that all too often women and men are treated as one group, “but that’s a very broad brush and unrepresentative view to take,” he explains. “We need to understand the experience of women, across all races, at a detailed level.”

He gives the example that an Asian woman from China will have a different experience from an Asian woman from Korea or India, which is why it’s so important to disaggregate the data. Otherwise, we are making what Solomon terms “binary comparisons.” Solomon is excited about the momentum Cloudera has on this issue because of the leadership of the company’s Chief Diversity Officer Sarah Shin and of Colby Berger with his Talent Acquisition team, in partnership with the HRBPs and their business units and with the company’s HR teams.

“It takes all of us working together looking at the same numbers and definitions,” he adds.

Dissecting the entire employee lifecycle

Solomon emphasizes the importance of shining a light on the experiences of women and underrepresented minorities at every stage: from when Cloudera sources and hires a candidate, through their first 90 days, promotions, and opportunities, to when they leave.

“The entire employee lifecycle needs to be dissected right from the moment of selection and first interviews,” he comments.

He elaborates with an example: “Is a woman coming from a data engineering role going to be at a disadvantage if she’s interviewed by a panel of three men? How does that experience break down across different ethnicities?”

Asking questions like these and interrogating the data a company uses can begin to bring about sustainable change and “maybe even predict that a candidate in the interview process is not getting a fair shake, before we miss out on good people.”

How to improve data expertise

Solomon offers explanations as to why companies are lacking in data expertise, and advice on how to improve it. Firstly, there is the concept of the system, and everyone, he explains, “to some degree is enforcing it.”

“The process for how we do things is hard wired and end-to-end benefits a few. Therefore, whatever we implement inside the system will likely yield small successes, but not the progressive change we need because as it stands people are focused on the representation of their ‘own’ not on making space for people who look, think and act differently. But that is what equality is,” he explains.

Secondly, Solomon highlights the need for an intrinsic link between diversity goals and company goals. Measuring diverse teams and their performance is one thing, he says, but are we looking at the long-term change diverse teams bring to companies? “Have they brought about a behavior change and how does that correlate with business performance over time? Is the culture learning and growing from more diversity?” he asks.

A further element is how diversity drives inclusion. “If we’re analyzing behavior change, how people act and think, then we can get a better sense of whether the company culture is opening up to be more inclusive of different perspectives or if women and URMs are having to be different people at work in order to conform to a historic system that we all know won’t save us,” he adds.

Contextualizing the role at Cloudera within wider society

When Solomon thinks about his role at Cloudera, he considers it within the context of society - a society he would like to create for his children. He also reflects on how society has brought him to this point in his life and the advantages, as well as disadvantages, he has experienced. He thinks of the people around him, the opportunities that they have received and those that have been overlooked. He shares his hope and desire: “to make ourselves accountable for delivering on our ambitions to create a just and fair society.”

When it comes to DE&I work, Cloudera has good foundations in the form of strong support from its CEO. The company is looking to measure how it is translating this commitment into meeting its diversity, equity, and inclusion goals.

“But do we know what diversity is, what equity or inclusion is, and can we measure them?” Solomon asks. “Everyone has some hypotheses on why we are losing women and underrepresented minorities.”

Solomon’s role is to use scientific approaches rather than anecdotal evidence: collecting and evaluating both hypotheses and data in order to test them and prove or disprove them. He cites an example: “We can say we have 10 women, 9 of them make $70k a year each, one makes $400k. One hypothesis is women are highly paid because on average they make $100k. Another hypothesis is women are underpaid because 9/10 of them make $70k. Who is correct?”

“You might say this is obvious, but you may be surprised as to how much we use these assertions without taking into account the biases and noise in the data. I see this as the primary aspect of my job: to say anyone can have an opinion, but to interpret data one needs a level of care that goes beyond opinions or even algorithms,” he concludes.

“I feel like I am well positioned because these misconceptions and misrepresentations have sometimes affected me personally.”
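Solomon's pay example can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from Cloudera's own analysis. It uses the ten salaries from the quote and Python's standard `statistics` module to compare the mean with the median. The variable names are assumptions chosen for this illustration.

```python
from statistics import mean, median

# Salaries from the example in the text: nine women at $70k, one at $400k.
salaries = [70_000] * 9 + [400_000]

average = mean(salaries)      # pulled upward by the single $400k salary
middle = median(salaries)     # the typical salary in the group
share_at_70k = salaries.count(70_000) / len(salaries)

print(f"mean:   ${average:,.0f}")
print(f"median: ${middle:,.0f}")
print(f"share earning $70k: {share_at_70k:.0%}")
```

The mean works out to $103,000, which the quote rounds to $100k, while the median is $70,000. Both hypotheses rest on correct arithmetic; they differ in which summary they choose, which is why a single outlier can make the same ten salaries support opposite conclusions.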

Continuously developing great expertise

Outside of work, Solomon has maintained a rigorous training and conference attendance schedule over the years to keep up with advances in technology. This has culminated in a master’s in Data Science and multiple certificates in AI/Data/Machine Learning, Big Data and ETL, working with methodologies, technologies and tools such as Agile, Spark, Hadoop, Python, AWS and GCP.

The data and diversity arena is constantly changing, and it is one that Solomon and his colleagues are contributing to in hugely significant ways.

Thank you, Solomon!

Work for a company that uses data for good

Cloudera employees believe data can make what is impossible today, possible tomorrow. They help Cloudera deliver an enterprise data cloud for any data, anywhere, from the Edge to AI.

Inspired to work for a company with such a firm commitment to building a diverse workforce?

Discover the latest job vacancies with Cloudera.



Disclosure: Where Women Work researches and publishes insightful evidence about how its paid member organizations support women's equality.
