From Beta to Breakthrough: Scaling Health AI from POC to Production for Everyday Impact 

Generative AI is redefining preventive health by enabling personalized, actionable and dynamic experiences that bridge the gap between traditional, episodic clinical advice and the healthier living choices individuals make every day. As health systems globally expand from ‘sick care’ to include ‘well care’, these emerging technologies hold immense potential to engage, motivate and empower individuals to manage their health proactively. Even so, transitioning AI-powered solutions from Proof of Concept (POC) to production is no small feat. On top of the technical and non-technical challenges typical of any platform put into production, the process requires careful design, multi-stakeholder and user validation, and a relentless focus on an end-user experience grounded in evidence-based behavioural health.

 

Health Kaki:  A health companion, just for me 

Health Kaki (derived from the Malay word ‘kaki’, meaning buddy or companion) is a generative AI platform designed to empower individuals to take control of their health choices and lifestyle through personalised digital engagement. The Health Kaki POC was co-developed by Synapxe and Temus with support from Amazon Web Services (AWS) and input from Singapore’s Health Promotion Board (HPB) for Singapore’s Ministry of Health (MOH), to help enable HealthierSG’s health plan.

The platform harnesses AI tools like Amazon Bedrock and the Anthropic Claude 3.5 Sonnet Large Language Model (LLM) to generate personalized diet and exercise plans from the rich resources and information across Singapore’s public health ecosystem. These plans align with each user’s health goals, cultural preferences, and clinical recommendations.

What makes this so powerful is that technology today can enable near-infinite permutations of choices and information to empower healthier living.

For example,  

“I want to do Meatless Monday today. Health Kaki, change this recommended dish from chicken to tofu… oh, and adjust the cooking times and nutrition information so I know my macros are on point.”

“Health Kaki, I pulled a late night at work yesterday, can you find a yoga class near me? And by the way, does that class qualify for a PAssion Card discount at my local Community Centre?”

In short, the “holy grail” of the right intervention, for the right person, at the right time is finally on the horizon with innovations in LLMs and generative AI.

Designing the Solution right, from Day 1 

It’s highly tempting to jump excitedly into new technologies and start experimenting. After all, isn’t that what innovation, and part of being ‘agile’, is about? We agree there is value in that, and a time and place for it. But if the true objective is to scale an AI solution from POC to production, then the answer is no.

Instead, before spending time building a solution, or letting what a technology can do today guide your solution, focus on really understanding the needs, the existing workarounds, and therefore the features or functions the solution requires. Specifically:

1. Consider Viability: One of the critical aspects of transitioning a POC to production is ensuring its viability, which rests on three questions: what problem are we solving, how big is that problem, and how does the solution address its underlying cause? The market is rife with digital apps and platforms offering generic health and disease management solutions. The vast majority have failed, sometimes quite spectacularly. Viability matters even more for LLMs and generative AI, where the costs to develop, run and operate have fewer precedents and can climb quickly as a product scales and individual engagement rises.

2. ‘Right size’ Evaluation and Testing: LLMs and generative AI are emerging and evolving at a speed we have not seen in the past. A clear understanding of system performance under varying conditions is therefore required, and the conditions tested need to change as the innovation progresses from Proof of Concept to Proof of Value to Production. Health Kaki utilized scenario testing, metric-based evaluations, and iterative validation to refine its features and ensure reliability. Benchmarks like faithfulness scores and answer relevancy provided actionable insights into model outputs. With each advancement towards Production, the level of rigour and the types of testing evolve and increase. Only with this nuanced approach, rather than one-size-fits-all testing for ‘safety’, can the ecosystem strike the right balance between innovation and safety.

3. Get away from the tired tropes of ‘users’ and UAT. Who is a user? Technologists need to appreciate that, in health, decisions and actions do not rest with a singular “user.” For example, clinicians can make health recommendations (e.g., reduce sugar intake), patients can agree that it is an important goal, and an insurer or government can fund the service, but a caregiver could be the one who decides what a family eats for dinner. Newer technologies may also elicit fear of the unknown, or safety concerns, amongst any of those stakeholders, which also needs to be considered. A digital solution in health must therefore embrace these complex and seemingly contradictory points of view in evaluation.

To amplify that complexity, patients in user testing may readily say that they do not need this level of detail on how to reduce sugar intake. But when someone is unexpectedly diagnosed with an acute health condition, or with several health conditions, they may well change their mind. Our health and life situations change, and so does what “users” value and need.
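The ‘right size’ evaluation idea in point 2 above can be made concrete as stage-specific quality gates. The following is a minimal sketch under stated assumptions: the stage names and the two metrics come from the text, but the threshold values, the `EvalResult` structure and the `passes_gate` function are hypothetical illustrations, not the Health Kaki team's actual tooling.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical thresholds: the bar rises as the product nears production.
STAGE_THRESHOLDS = {
    "poc": {"faithfulness": 0.70, "answer_relevancy": 0.70},
    "pov": {"faithfulness": 0.80, "answer_relevancy": 0.80},
    "production": {"faithfulness": 0.90, "answer_relevancy": 0.85},
}


@dataclass
class EvalResult:
    """Scores for one test scenario, each assumed to lie in [0, 1]."""
    scenario: str
    faithfulness: float
    answer_relevancy: float


def passes_gate(results, stage):
    """Return (passed, averages) against the thresholds for one stage."""
    if not results:
        raise ValueError("at least one evaluation result is required")
    thresholds = STAGE_THRESHOLDS[stage]
    # Average each metric across all scenarios in the test set.
    averages = {m: mean(getattr(r, m) for r in results) for m in thresholds}
    passed = all(averages[m] >= limit for m, limit in thresholds.items())
    return passed, averages


if __name__ == "__main__":
    # Made-up scores for two scenarios modelled on the examples in the text.
    results = [
        EvalResult("swap chicken for tofu", 0.92, 0.88),
        EvalResult("find a yoga class nearby", 0.81, 0.90),
    ]
    for stage in STAGE_THRESHOLDS:
        passed, averages = passes_gate(results, stage)
        print(stage, "pass" if passed else "fail", averages)
```

With these made-up scores the same model output clears the POC and Proof-of-Value gates but not the production gate, which is the point of scaling rigour with each stage rather than applying a single bar throughout.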

  

Overcoming Technical challenges 

As with any innovation, there are a multitude of challenges, known unknowns and, more importantly, “unknown unknowns”. In developing Health Kaki, some of the technical challenges listed below were addressed by employing advanced prompting techniques and integrating a contextualized knowledge base. Rigorous model evaluations using benchmarks like faithfulness scores and LMSYS leaderboards ensured the Claude 3.5 Sonnet model consistently delivered reliable and context-aware outputs.

  • Data Quality and Diversity: A cornerstone of successful AI scaling is ensuring the availability of high-quality, diverse datasets that reflect the population's nuances. For Health Kaki, this meant addressing Singapore's unique cultural and dietary landscape, including halal and vegetarian dietary habits, traditional Chinese medicine influences, and varied exercise preferences.  To tackle this, the team employed a hybrid human-AI approach. Data from reliable sources like the Health Promotion Board was enriched with metadata using LLMs. Human experts validated and refined this data, ensuring cultural relevance and contextual accuracy. This rigorous process laid the foundation for generating tailored recommendations that resonated with users. 
  • Infrastructure and Experience Design: Deploying AI solutions at scale demands a careful balance between robust infrastructure and seamless user experience.   For example, generative AI systems, while powerful, can often introduce computational delays that can impact real-time responsiveness. Health Kaki overcame this by adopting progressive loading strategies. These strategies provided users with engaging intermediate content while personalized outputs were being processed. Extensive user testing validated this approach, with participants appreciating the thoughtful UX design and clear progress indicators that minimized perceived wait times. 
  • Tools that Balance Accuracy and Scalability: Choosing the right models and tools is essential. Claude 3.5 Sonnet's demonstrated ability to balance personalization, scalability, and accuracy made it a cornerstone of Health Kaki's architecture. Metrics such as ease of use, processing speed, and flexibility were analyzed to understand the models' ability to handle evolving requirements and integrate seamlessly into the Health Kaki platform. 
  • Consistency in Personalization: For generative AI to build trust, recommendations must be both highly personalized and consistently accurate. Maintaining this balance across interactions posed a significant challenge for Health Kaki. 
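The progressive loading strategy described under Infrastructure and Experience Design can be sketched with Python's `asyncio`: start the slow generation in the background and stream interim content until it finishes. This is an illustrative sketch only. `generate_plan` is a hypothetical stand-in for the model call (no real model is invoked), and the function names, messages and timings are assumptions rather than details of the Health Kaki implementation.

```python
import asyncio
from collections.abc import AsyncIterator


async def generate_plan(goal: str) -> str:
    """Hypothetical stand-in for a slow LLM call; sleeps instead of calling a model."""
    await asyncio.sleep(0.3)
    return f"Personalised plan for: {goal}"


async def progressive_response(
    goal: str, tips: list[str], interval: float = 0.1
) -> AsyncIterator[tuple[str, str]]:
    """Yield ("interim", tip) messages until the plan is ready, then ("final", plan)."""
    task = asyncio.create_task(generate_plan(goal))
    shown = 0
    while True:
        # Wait up to `interval` seconds for the background task to finish.
        done, _pending = await asyncio.wait({task}, timeout=interval)
        if done:
            break
        tip = tips[shown % len(tips)] if tips else "Working on your plan..."
        yield ("interim", tip)
        shown += 1
    yield ("final", task.result())


async def collect(goal: str, tips: list[str]) -> list[tuple[str, str]]:
    """Gather every message; a real UI would render each one as it arrives."""
    return [message async for message in progressive_response(goal, tips)]


if __name__ == "__main__":
    tips = ["Did you know? Tofu is a good source of protein.",
            "Tip: a short walk after meals is an easy habit to start."]
    for kind, text in asyncio.run(collect("Meatless Monday", tips)):
        print(f"[{kind}] {text}")
```

The design choice here is that the user always sees something within `interval` seconds, while the personalised output replaces the interim content as soon as it is available.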

Guardrails 

Implementing robust guardrails was a critical aspect of the Health Kaki project. These guardrails serve as essential safeguards, ensuring that AI-generated health recommendations remain appropriate, safe, and trustworthy. The Health Kaki team approached this challenge by collaborating closely with subject matter experts to define a comprehensive set of parameters. These guardrails encompass various aspects, including dietary considerations, exercise guidelines, health condition precautions, and lifestyle factors. By integrating these safeguards into the core of the recommendation engine, Health Kaki can deliver personalized wellness plans that are not only engaging but also align with each user’s unique health profile and needs. This approach demonstrates a commitment to responsible AI deployment in health technology, balancing innovation with the paramount importance of user safety and overall efficacy. 
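One simple way to picture such guardrails is a rule check that runs on every candidate recommendation before it reaches the user. The sketch below is a hypothetical illustration: the profile fields, tags and rule tables are invented placeholders for parameters that subject matter experts would define, and the source does not describe Health Kaki's actual guardrail mechanism at this level of detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserProfile:
    dietary_restrictions: frozenset = frozenset()  # e.g. {"halal", "vegetarian"}
    health_conditions: frozenset = frozenset()     # e.g. {"hypertension"}


@dataclass(frozen=True)
class Recommendation:
    title: str
    tags: frozenset = frozenset()  # descriptive tags attached to the item


# Placeholder rule tables (hypothetical values, for illustration only).
DIET_CONFLICTS = {
    "vegetarian": {"contains_meat", "contains_seafood"},
    "halal": {"contains_pork", "contains_alcohol"},
}
CONDITION_CONFLICTS = {
    "hypertension": {"high_sodium"},
    "knee_injury": {"high_impact"},
}


def violations(rec: Recommendation, profile: UserProfile) -> list[str]:
    """List every rule the recommendation breaks for this user."""
    reasons = []
    checks = (
        (profile.dietary_restrictions, DIET_CONFLICTS),
        (profile.health_conditions, CONDITION_CONFLICTS),
    )
    for attributes, rules in checks:
        for attribute in sorted(attributes):
            clash = rec.tags & rules.get(attribute, set())
            if clash:
                reasons.append(f"{attribute}: {', '.join(sorted(clash))}")
    return reasons


def apply_guardrails(recs, profile):
    """Split recommendations into allowed items and (item, reasons) blocks."""
    allowed, blocked = [], []
    for rec in recs:
        reasons = violations(rec, profile)
        if reasons:
            blocked.append((rec, reasons))
        else:
            allowed.append(rec)
    return allowed, blocked
```

Keeping the rules in plain data tables, separate from the checking logic, is one way to let domain experts review and update them without touching code. Returning the reasons alongside each blocked item also leaves an audit trail for later validation.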

 

From POC to Impact: A Blueprint for Success 

Scaling AI healthcare solutions demands more than technical prowess; it requires a comprehensive approach that integrates user-centric design, strong engineering capabilities and iterative validation. Health Kaki’s success in navigating these complexities underscores the importance of collaboration, careful planning, and a relentless focus on the complex web of user needs. 

As the project moves forward, the insights and learnings from continuous user validation will continue to shape the future development and expansion of the platform. The team’s focus on scalability, modular design, and continuous improvement positions Health Kaki to evolve alongside emerging AI technologies, ultimately driving an inclusive solution that can inspire and empower a wide demographic of residents to take steps every day towards living healthier and happier.
