Last year, Sky Solutions hosted an internal AI Hackathon to generate practical, forward-looking solutions for our federal clients across health care, national security, and financial services. This initiative empowered our associates to collaborate and brainstorm innovative technologies and services that can be tailored to the unique needs of the federal agencies we serve.
This blog kicks off a series spotlighting some of our team’s breakthrough solutions and strategies that reflect the ingenuity, teamwork, and client-first mindset that define Sky Solutions.
At Sky Solutions, we use technology to create real-world benefits for both internal teams and end users. During our AI Hackathon, one team’s concept captured our attention for its bold vision: a program that transforms public government datasets into conversational, user-friendly tools. Thus, Sky Solutions’ Talk to Me About the Data program, powered by large language models (LLMs), was born.
The Challenges of Public Data
The U.S. federal government maintains one of the world’s largest collections of open datasets through platforms such as Data.gov. From health statistics to bank closures, this data is taxpayer-funded and publicly available. However, accessing and making sense of it often requires the technical expertise to download raw CSV files, write SQL queries, and build pivot tables in spreadsheets. As a result, for the average citizen or policymaker, this valuable data remains largely inaccessible.
The Idea: From Static Files to Smart Conversations
What if we could query government data and get an immediate, accurate, plain-English response?
Talk to Me About the Data proposes doing exactly that. By embedding government datasets into a chatbot interface powered by LLMs, similar to ChatGPT, the program would allow users to engage in natural conversations with public data.
As a proof of concept, the project focused on a dataset from the Federal Deposit Insurance Corporation (FDIC) that tracks failed banks in the U.S. Rather than digging through the raw spreadsheet, users could ask, “How many banks closed in Texas in 2023?” and get a contextualized, precise answer from the AI.
Behind the Scenes: Building the AI Assistant
To bring this vision to life, the project involved several technical steps:
- Data Preparation: The FDIC’s CSV file was parsed using a simple Python script. Information such as bank name, location, closure date, and acquiring institutions was extracted.
- Prompt Engineering: Since LLMs do not inherently “understand” a dataset’s structure, the model needed custom prompts to provide context. These prompts defined the model’s role (for example, as an “FDIC research assistant”) and taught it how to interpret and respond to user questions.
- Model Training: Meta’s open-source Llama 3 model was trained on these prompts in Google Colab. The training used low-rank adaptation (LoRA), a method that fine-tuned the base model to the specific dataset without requiring extensive resources.
- Challenges and Learnings: One key insight our Sky team gained throughout this project was that more training isn’t always better. In fact, fewer training cycles produced more accurate, targeted responses. The experiment also highlighted the importance of selecting the right base model and carefully designing prompts to preserve data integrity and context.
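The data preparation and prompt engineering steps above can be sketched roughly as follows. This is a minimal illustration, not the team’s actual script: the column names, the sample rows, the wording of the system prompt, and the helper functions `load_banks`, `count_closures`, and `build_example` are all assumptions made for the example, and the real FDIC file may be laid out differently.

```python
import csv
import io
import json
from datetime import datetime

# Hypothetical sample rows. The column names and values are illustrative
# assumptions, not the real FDIC file layout or real bank closures.
SAMPLE_CSV = """Bank Name,City,State,Closing Date,Acquiring Institution
Example Bank A,Austin,TX,2023-03-10,Example Acquirer One
Example Bank B,Dallas,TX,2023-07-28,Example Acquirer Two
Example Bank C,Miami,FL,2023-11-03,Example Acquirer Three
Example Bank D,Houston,TX,2022-05-20,Example Acquirer Four
"""

# Assumed wording for the role-defining prompt described in the post.
SYSTEM_PROMPT = (
    "You are an FDIC research assistant. Answer questions about failed "
    "banks using only the records provided."
)


def load_banks(csv_text):
    """Parse CSV text into a list of dicts with a parsed closing date."""
    banks = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        banks.append({
            "name": row["Bank Name"].strip(),
            "city": row["City"].strip(),
            "state": row["State"].strip(),
            "closed": datetime.strptime(
                row["Closing Date"].strip(), "%Y-%m-%d"
            ).date(),
            "acquirer": row["Acquiring Institution"].strip(),
        })
    return banks


def count_closures(banks, state, year):
    """Count banks that closed in the given state during the given year."""
    return sum(
        1 for b in banks if b["state"] == state and b["closed"].year == year
    )


def build_example(banks, state, state_name, year):
    """Build one prompt/response record whose answer is computed from the data."""
    matches = sorted(
        b["name"] for b in banks
        if b["state"] == state and b["closed"].year == year
    )
    question = f"How many banks closed in {state_name} in {year}?"
    if matches:
        noun = "bank" if len(matches) == 1 else "banks"
        answer = (
            f"{len(matches)} {noun} closed in {state_name} in {year}: "
            f"{', '.join(matches)}."
        )
    else:
        answer = f"No banks in the dataset closed in {state_name} in {year}."
    return {"system": SYSTEM_PROMPT, "prompt": question, "response": answer}


banks = load_banks(SAMPLE_CSV)
example = build_example(banks, "TX", "Texas", 2023)
print(json.dumps(example, indent=2))
```

Computing each answer directly from the parsed rows, rather than writing it by hand, is one way to keep the training records consistent with the underlying data. Records in this shape could then be written out one per line as input to a fine-tuning run.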
Why This Matters
Sky Solutions’ Talk to Me About the Data project is more than a technical exercise; it’s a step toward ensuring public data is accessible for all. By making government datasets conversational, we empower citizens, journalists, researchers, and decision-makers to ask meaningful questions without needing a data science degree or expensive development resources.
This idea has clear applications beyond the FDIC. Imagine interacting with public health statistics, census data, or environmental records as easily as chatting with a virtual assistant. This approach democratizes data and transforms static repositories into living, accessible knowledge bases.
What’s Next
Now moving into the proof-of-concept phase, our Talk to Me About the Data team is working to refine the interface, add more datasets, and explore broader adoption across agencies. With further development, this platform could serve as a model for modernizing public data interaction across the federal landscape.
At Sky Solutions, we believe innovation isn’t just about creating new technology in a vacuum; it’s about transforming what government agencies can offer the public and their own internal teams. Sometimes that starts with simply asking the data to talk to us.