In today’s fast-paced digital era, organisations need a strong data strategy in place before implementing an AI strategy. The growing dependence on AI for decision-making and strategic planning makes accurate, dependable data essential. Yet considerable obstacles remain, including concerns about data privacy, problems with data quality, and the need for skilled data analysts.
Public Spectrum caught up with Theo Hourmouzis, Vice President for Australia and New Zealand at Snowflake, whose background in data management and AI implementation positions him well to offer valuable insights into this pressing issue.
A seasoned expert in the field, Hourmouzis has a proven track record of helping organisations across multiple industries navigate the complexities of data management, and he has played a key role in encouraging organisations to embrace AI technologies by fostering a data-centric culture.
Theo Hourmouzis discusses the importance of a strong data strategy for organisations looking to develop an AI strategy.
There is no AI strategy without a data strategy. The first thing to understand about AI is that it is entirely dependent on the data it is fed. Generative AI (GenAI) and Large Language Models (LLMs) may seem sentient, but they aren’t capable of thinking or reasoning. What they do, and they can do it extremely well, is find patterns in data and predict what should come next.
This predictive capability is based on the data it has access to. If that data is incomplete or managed incorrectly, then the results AI delivers cannot be trusted. Not only is there the risk of hallucinations when it provides incorrect analysis of the data, but there is also the possibility of sensitive or personally identifiable information being exposed.
In the rush to AI, there are three common pitfalls I see organisations fall into: overlooking the governance, quality, and completeness of the data being fed into an AI model.
Data governance is critical to any successful AI initiative. One of the major benefits of AI is that it promises to democratise data access, but this brings with it the risk of sensitive data leakage. Having guardrails around your data, like role-based access controls (limiting access based on the seniority, department, or function of each employee), data masking (obfuscating personally identifiable information), or data clean rooms (allowing collaboration without disclosing raw data), can greatly mitigate this risk.
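To make these guardrails concrete, here is a minimal, illustrative Python sketch of role-based access control combined with data masking applied before records reach an AI pipeline. The role names, fields, and masking rule are hypothetical examples, not any particular platform’s implementation.

```python
# Illustrative sketch only: role-based access control plus data masking
# applied before records are handed to an AI pipeline. Roles, field names,
# and the masking rule are hypothetical.

SENSITIVE_FIELDS = {"tax_file_number", "date_of_birth", "email"}

ROLE_PERMISSIONS = {
    "analyst": {"can_see_pii": False},
    "privacy_officer": {"can_see_pii": True},
}

def mask_value(value: str) -> str:
    """Obfuscate a value while keeping its shape recognisable."""
    return "***" if len(value) <= 4 else value[:2] + "***" + value[-2:]

def apply_guardrails(record: dict, role: str) -> dict:
    """Return a copy of the record with PII masked for unprivileged roles."""
    perms = ROLE_PERMISSIONS.get(role, {"can_see_pii": False})
    if perms["can_see_pii"]:
        return dict(record)
    return {
        field: mask_value(str(value)) if field in SENSITIVE_FIELDS else value
        for field, value in record.items()
    }

if __name__ == "__main__":
    citizen = {"name": "Jane Doe", "email": "jane@example.gov.au", "postcode": "2600"}
    print(apply_guardrails(citizen, "analyst"))          # email is masked
    print(apply_guardrails(citizen, "privacy_officer"))  # full record visible
```

In practice, such policies would be enforced centrally by the data platform rather than in application code, but the principle is the same: access is determined by role, and sensitive fields are obfuscated by default.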
Data quality cannot be overlooked in an AI strategy. If the data being fed into a model is out of date, miscategorised, or otherwise incorrect, then no results can be trusted. In other words, ‘rubbish in, rubbish out’.
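A simple way to picture this is a quality gate that rejects stale or miscategorised records before they ever reach a model. The sketch below is purely illustrative; the field names, allowed categories, and 30-day freshness window are assumptions made for the example.

```python
# Illustrative "rubbish in, rubbish out" checks applied before records reach
# an AI model. Field names, allowed categories, and the freshness window are
# assumptions for the example.

from datetime import datetime, timedelta, timezone

ALLOWED_CATEGORIES = {"health", "transport", "education"}
MAX_AGE = timedelta(days=30)

def is_trustworthy(record: dict) -> bool:
    """Reject records that are stale, miscategorised, or missing fields."""
    required = {"id", "category", "updated_at"}
    if not required.issubset(record):
        return False
    if record["category"] not in ALLOWED_CATEGORIES:
        return False
    age = datetime.now(timezone.utc) - record["updated_at"]
    return age <= MAX_AGE

records = [
    {"id": 1, "category": "health", "updated_at": datetime.now(timezone.utc)},
    {"id": 2, "category": "unknown", "updated_at": datetime.now(timezone.utc)},
]
clean = [r for r in records if is_trustworthy(r)]
print(f"{len(clean)} of {len(records)} records passed quality checks")
```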
Having a complete view of an organisation’s data ensures any answers delivered by AI are made in complete context. Organisations typically have myriad data silos and sources; in the public sector, for example, this data dispersal is pronounced due to the number of agencies, departments, and ministerial offices. These need to be collapsed and brought to a central location to ensure AI has a complete view of the organisation. Further, once data is stored in a central location, ensuring governance and quality becomes much easier as policies can be implemented and managed centrally.
The most critical component of a successful data strategy is centralising data within a single data platform. This breaks down the silos that exist within an organisation and makes the challenges of implementing AI much easier to manage.
When data is all centrally located, governance policies can be implemented and managed universally, ensuring there are no gaps. Additionally, security measures such as role-based access controls and data masking for personally identifiable information are much easier to maintain.
Another benefit is that it accelerates AI innovation within the business. Rather than the traditional model of bringing data to the AI application and repeating this process for each use case, this approach allows the business to bring the app to the data.
As that data is readily available, governance is already assured, and security measures have already been implemented, innovation isn’t impeded.
By bringing all an organisation’s data together within Snowflake’s AI data cloud, data silos are eliminated and architectures are simplified. This enables value to be extracted from data much quicker.
Further, once data has been centralised, we provide seamless access to AI and Machine Learning via the Snowflake Platform, the Snowflake Marketplace, and our partners. This radically simplifies the process of innovating with AI, as pre-built models and applications can be deployed with just a few clicks. Earlier this month, we also announced Snowflake Arctic, a powerful enterprise-focused LLM that organisations can use to develop their own conversational copilots and chatbots. It will be available through Snowflake Cortex, a fully managed service that offers machine learning and AI solutions to Snowflake users. These features are currently available in some regions and are being rolled out to most customers across the globe over time.
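As a rough sketch of what “bringing the app to the data” can look like, the snippet below calls a Cortex LLM function from Python via the standard Snowflake connector. The connection parameters are placeholders, and availability of the COMPLETE function and the snowflake-arctic model depends on your account’s region and edition.

```python
# Minimal sketch: invoking an LLM through Snowflake Cortex from Python using
# the snowflake-connector-python package. Connection details are placeholders;
# model and function availability varies by region.

import snowflake.connector

conn = snowflake.connector.connect(
    account="<account_identifier>",   # placeholder
    user="<user>",                    # placeholder
    authenticator="externalbrowser",  # or another supported auth method
    warehouse="<warehouse>",
    database="<database>",
    schema="<schema>",
)

try:
    cur = conn.cursor()
    # The prompt never leaves the platform; the model runs where the data lives.
    cur.execute(
        "SELECT SNOWFLAKE.CORTEX.COMPLETE("
        "'snowflake-arctic', "
        "'Summarise our data governance obligations in two sentences.')"
    )
    print(cur.fetchone()[0])
finally:
    conn.close()
```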
The biggest trend I see emerging is the democratisation of data across organisations. In days gone by, data scientists and analysts had complete dominion over how an organisation’s data would be used, and typically only they had the training and know-how to use it.
But as data becomes accessible and AI applications are made available in the same way you’d download an app on your phone, suddenly the barrier to entry is much, much lower.
This will enable non-technical employees—say someone in marketing, HR, or even those at the executive level—to very easily spin up their own AI to extract the insights they need.
The first step towards this shift is bedding down the data strategy and ensuring non-negotiables like governance and security blanket all your data. Once this has been put in place, the possibilities are endless.
Justin Lavadia is a content producer and editor at Public Spectrum with a diverse writing background spanning various niches and formats. With a wealth of experience, he brings clarity and concise communication to digital content. His expertise lies in crafting engaging content and delivering impactful narratives that resonate with readers.