Five Data Liberation Army
“Give me a data set big enough and I will move the world”
– Archimedes (sort of…)
The current global pandemic has taught us many things, from both societal and individual perspectives. One thing it has made clear, even to the uninitiated, is the importance of data, and in particular of sharing and democratising data for the public good.
In the mainstream, one often hears data described as the new oil. This simple comparison has some merit, as both are very valuable to global society. As with oil, control and ownership of data bring great wealth and power: just think of some of the most powerful data barons, e.g. FB, Google, …
The real issue arises when people and organisations take this analogy literally, treating data as a finite, scarce resource, hoarding it in silos and locking it away from broader use. As Data Scientists, we’re often faced with the challenge of gaining access to, and in particular an understanding of, the data required for our analysis and modelling.
It’s important to realise that data is simply a means to an end – an enabler for effective data-driven decision making.
What successful organisations have realised in this age of data is that to truly unlock its potential and achieve actionable outcomes, you must first create, enable and support a Data Culture.
Here are the five key things you need to get right, as I’ve identified them while working across numerous industries and domains throughout my career:
1. Data Democracy: To liberate your data, you first need to free it by helping to break down silos and entrenched attitudes. The data ultimately belongs to the organisation, not to individuals within it, and its purpose is to serve the greater good of the organisation. The aim should be to grant access to anyone who wants or needs it, and to promote collaboration between experts such as business SMEs and Data Scientists. It often helps to share your wins with the broader group and to stimulate demand, creating a ‘hunger’ for the data. Technology makes this much easier these days, with some fantastic tools and platforms available, such as visualisation and interactive tools for those who don’t need low-level access. A great way to enable this is to create self-service systems that allow easy access, for example via data marts. Ultimately, it should be everyone’s responsibility to share data.
2. Education: I believe that it’s a key responsibility of all Data Scientists to help improve the Data Literacy of their organisations. Doing so demonstrates the importance of data and of modern analysis tools and approaches, and it also guides individuals and organisations in how to engage with data and analytics experts, and what to expect from them. An effective way I’ve discovered to educate the broader organisation is to become a data evangelist, running roadshows, proofs of concept and hands-on demos.
3. Leadership: Strong leadership is absolutely essential in enabling a Data Culture. Without executive support, the whole endeavour will often be fruitless. The aim is to have senior executive support, ideally through a C-suite executive such as a Chief Data Officer (who can set an example and encourage the use of data for decision making), and also Technical Leadership for the data and analytics teams. The latter is fundamental, as technical people with strong business acumen have a unique understanding of both business challenges and current and emerging tech, which greatly helps in identifying opportunities and in finding effective and efficient solutions.
4. Technical Resources: By resources, I mean both people and tools! People: The challenge here is in attracting and retaining the right people, and in placing them in the right roles. This is why it’s imperative to have good technical leadership. Typically, you’ll need Data Scientists, Data Engineers and a technical leader, at a minimum. Tools: Tools are vital, but most important of all is allowing technical users the freedom to select, from a range of tools, those best suited to the task at hand. This often means creating the right environment and scope for them to experiment with current and emerging tools and software. Think open-source, distributed data tools and DataOps.
5. Data: Without data itself, the whole endeavour is obviously futile. The main challenges faced by many organisations include handling large volumes of data (with ever-increasing rates of ingestion and creation) and different types of data (e.g. unstructured data, mainly text, but also video, sound, …, evolving beyond the traditional RDBMS). This ultimately comes down to learning to innovate at scale, and some organisations benefit from developing a formal innovation function. Given that humans don’t scale, and that manual processes no longer keep up, the idea is to identify opportunities for aligning emerging tech with business challenges, such as ‘operational’ areas that have traditionally depended on manual human processing and can benefit from automation. This is a unique and exciting position for Data Scientists to be in.
Don’t forget the importance of developing a deep understanding of your entire data chain and analytics pipeline! You need to determine who ‘owns’ what, who the enablers and blockers are, how the data is used, and all the basic data fundamentals, such as access, quality, quantity, type and usefulness.
Finally, you also need to align and collaborate with all the data stewards within your organisation, and with those in any partner organisations you work with. After all, these gatekeepers are often the secret to your success!