How You Can Experience the Power of LLMs in HR
By Dirk Jonker – Founder at Crunchr
We are likely two to three years away from Artificial General Intelligence (AGI). This is no longer a futuristic concept; it’s coming, fast. AGI won’t just mean better tools: it will be able to replace anyone who works behind a computer, including many roles in HR.
That’s why now is the time to experiment, learn, and build. This moment is an invitation for HR professionals to get hands-on with large language models (LLMs) and experience what they’re capable of. The goal is not to become a developer, but to understand just enough to shape what’s coming.
What we will build together
This is a simple AI assistant that helps people analytics teams get quick, meaningful insights from employee survey comments. It runs entirely on your own machine—no cloud, no API calls—so everything stays private and secure.
You just ask it open-ended questions in plain English, like “What do employees say about recognition in the Sales department?” and it responds with a concise summary based directly on what employees actually wrote in their survey responses.
Behind the scenes, it uses modern AI tools to understand language, organize your data, and generate answers—everything runs locally using Ollama and LLaMA3. The setup takes a few steps, but once it’s running, the experience is smooth and instant. It’s a great way to make open-text feedback accessible and actionable for your team.
This is more than a fun experiment. With AGI just a few years away, now is the moment for HR to get hands-on, build intuition, and lead the way. Jump on the train, or risk being left behind.
Setting up the environment on your computer
Step one is setting up your environment. I’m using Docker to run everything in a contained workspace. Think of Docker as a ready-to-use lab for running software—no setup headaches, no compatibility issues, no messy installs. It lets you run powerful tools like AI assistants in a clean, self-contained environment right on your computer. Everything you need is bundled together: the code, the AI model, and all the technical bits behind the scenes.
For people analytics teams, this means you don’t need to be an engineer to start experimenting with AI. With Docker, you can launch a local assistant that reads your employee survey comments and gives you clear, smart summaries—without touching the cloud or sharing your data. It works the same on every machine, and it keeps things private, simple, and fun.
This setup gives you your own AI sandbox, right on your laptop. It’s a low-risk way to build confidence with generative AI—and with AGI just a few years out, now is the time to start playing and learning.
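Not sure whether Docker is already on your machine? A quick version check in the terminal will tell you (this is just a sanity check, not part of the setup itself):
docker --version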
Next, download Ollama to run the AI model locally. Think of Ollama as a simple, developer-friendly tool that makes large language models (LLMs) like LLaMA3 easy to use on your own machine. It handles all the loading, running, and chatting with the model—no API keys, no cloud latency, and no risk of leaking sensitive data. If ChatGPT runs in the cloud, Ollama is what brings that power directly to your laptop, under your control.
Finally, you’ll need a code editor—something like VS Code works perfectly. It lets you view and modify the scripts if you want, but you won’t need to code anything to get started.
The beauty of this setup is that everything happens locally: it’s fast, private, and fully contained. Just fire it up, ask your questions, and see what your people are really saying.
Once you’ve installed Ollama, run the command below to download the model. On the Ollama website you can see there are many other LLMs you can use; I’m choosing LLaMA3:
ollama pull llama3
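To confirm the model is ready, you can send it a quick test prompt straight from the terminal (any short prompt will do):
ollama run llama3 "Reply with one word: ready?"
If you get a response back, Ollama and the model are working.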
Setting up your project folder and adding the code
Now it’s time to give your AI assistant the structure it needs to run. Think of this step as setting up the workspace: folders, files, and logic that will bring your data and questions to life. (We’ll cover preparing the survey CSV itself a little further down.)
Start by creating a folder on your computer—call it something like survey-ai. Inside this folder, you’ll add:
- Your survey CSV file (for example, survey_data.csv)
- A few Python scripts that do the heavy lifting
- A Dockerfile to build the container
- A requirements.txt file that lists the Python libraries your assistant needs
Here’s what the folder structure will look like:
survey-ai/
├── ask_survey_agent.py
├── index_survey.py
├── survey_data.csv
├── requirements.txt
└── Dockerfile
Let’s quickly walk through what each of the scripts does:
- index_survey.py: reads your survey responses, splits them into chunks, generates vector embeddings, and stores them in a local ChromaDB database.
- ask_survey_agent.py: takes your question, finds the most relevant survey chunks, builds a context prompt, and asks LLaMA3 (via Ollama) to generate an answer.
- Dockerfile: defines the environment—Python version, dependencies, and where your code lives.
- requirements.txt: lists the Python packages your assistant depends on (chromadb, sentence-transformers, pandas, etc.).
Here you can find the bare minimum code for each file:
index_survey.py
import pandas as pd
from sentence_transformers import SentenceTransformer
import chromadb

chroma_client = chromadb.PersistentClient(path="./survey_db")

# Load survey
df = pd.read_csv("survey_data.csv")

# Prepare chunks
chunks = []
for _, row in df.iterrows():
    if pd.notna(row.get("question")):
        chunks.append(f"{row['department']} department: '{row['question']}' - Response: {row.get('response', 'N/A')}, Score: {row.get('score', 'N/A')}")
    if pd.notna(row.get("comment")):
        chunks.append(f"{row['department']} department: Comment: {row['comment']}")

# Embed and store (delete first so re-running the script doesn't collide with existing ids)
model = SentenceTransformer('all-MiniLM-L6-v2')
collection = chroma_client.get_or_create_collection(name="survey_chunks")
collection.delete(ids=[str(i) for i in range(len(chunks))])
for i, chunk in enumerate(chunks):
    vector = model.encode(chunk).tolist()
    collection.add(documents=[chunk], embeddings=[vector], ids=[str(i)])

print("Your survey data has been indexed.")
ask_survey_agent.py
import json

import requests
from sentence_transformers import SentenceTransformer
import chromadb

# Connect to Chroma and the embedding model
chroma_client = chromadb.PersistentClient(path="./survey_db")
model = SentenceTransformer('all-MiniLM-L6-v2')
collection = chroma_client.get_or_create_collection(name="survey_chunks")

# Ask a question
question = input("Ask a question about the survey: ")
query_vec = model.encode(question).tolist()
results = collection.query(query_embeddings=[query_vec], n_results=3)
context = "\n".join(results["documents"][0])

# Build prompt
prompt = f"""
You are an expert HR analyst helping interpret employee survey data.
Use the following context to answer the question.
Context:
{context}
Question:
{question}
Answer:
"""

# Call Ollama (localhost works because the container runs with --network=host)
response = requests.post(
    "http://localhost:11434/api/chat",
    json={"model": "llama3", "messages": [{"role": "user", "content": prompt}]},
    stream=True,
)
if response.status_code != 200:
    print("Error from Ollama:", response.text)
    exit(1)

# Ollama streams one JSON object per line; stitch the message content together
content = ""
for line in response.iter_lines():
    if line:
        chunk = json.loads(line)
        content += chunk.get("message", {}).get("content", "")

print("\nAI Answer:\n", content)

# Keep a simple log of questions and answers
with open("qa_log.txt", "a") as log:
    log.write(f"Q: {question}\nA: {content}\n{'-'*40}\n")
Dockerfile
FROM python:3.10-slim
RUN apt-get update && apt-get install -y \
    git build-essential \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python3", "ask_survey_agent.py"]
requirements.txt
# Core dependencies
pandas==2.2.2
chromadb==0.4.22
sentence-transformers==2.6.1
requests==2.31.0
numpy<2.0
Once the files are in place, build your Docker image from inside the folder:
docker build -t datadirk-ai .
This will create a clean, self-contained AI assistant on your machine. No extra setup, no dependency issues.
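If you want to double-check that the build succeeded, you can list the image (docker image ls is a standard Docker command):
docker image ls datadirk-ai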
Preparing your data
The assistant works by reading actual employee survey comments and turning them into something searchable and meaningful using AI. But before it can do that, we need to organize the data in a simple format the assistant understands.
Start with a CSV file—a spreadsheet where each row is a comment from an employee. You can export this directly from your existing survey platform. All you need is a column with the comment text, and ideally a few extra columns like department, location, or role. That context helps the assistant give more targeted answers later.
Here’s a simple example of what your CSV might look like (the column names are lowercase so they match what the indexing script expects):
comment | department | location
I feel valued by my manager and the team. | Sales | New York
Too much overtime lately, it’s burning people out. | Sales | Chicago
Communication from leadership could be clearer. | HR | London
Once your file is ready, the AI assistant will read each comment, break it into smaller, meaningful chunks, and turn those chunks into numeric “embeddings”: smart fingerprints that help the AI understand what the text means, not just what it says. This is done using a library called sentence-transformers. It’s part of the behind-the-scenes magic that lets the system respond to your questions with context-aware insights.
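Curious what those fingerprints look like in practice? Here’s a minimal sketch using the same all-MiniLM-L6-v2 model the assistant uses (the two example sentences are made up for illustration):
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('all-MiniLM-L6-v2')

# Two comments that share meaning but almost no words
a = model.encode("I feel valued by my manager.")
b = model.encode("My boss appreciates my work.")

# Cosine similarity near 1 means "similar meaning"; near 0 means unrelated
print(util.cos_sim(a, b))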
To store all this efficiently, we use a tool called ChromaDB. It’s a local vector database that remembers all those embeddings. Think of it as a smart filing cabinet for everything your people said, one the AI can search instantly by meaning.
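And here’s a minimal sketch of what searching that filing cabinet by meaning looks like, assuming you’ve already run the indexing step so the ./survey_db folder exists (the example question is just an illustration):
import chromadb
from sentence_transformers import SentenceTransformer

# Open the same local database that index_survey.py creates
client = chromadb.PersistentClient(path="./survey_db")
collection = client.get_or_create_collection(name="survey_chunks")

# Turn a question into a vector, then ask Chroma for the closest comments
model = SentenceTransformer('all-MiniLM-L6-v2')
query_vec = model.encode("How do people feel about workload?").tolist()
results = collection.query(query_embeddings=[query_vec], n_results=3)
print(results["documents"][0])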
The best part? Everything stays on your machine. No cloud upload, no third-party processing. You get fast, private access to your data. And full control over how it’s used.
Launching your local AI assistant
Now that your data is structured and prepped, it’s time to bring your assistant to life. This part wires everything together so you can start asking natural-language questions and getting real answers—directly from your survey data.
You’ll run two short scripts that together do three things:
- It loads your CSV file and turns each comment into a vector using sentence-transformers. These vectors capture the meaning of the text.
- It stores those vectors into ChromaDB, your local search engine that can instantly find the most relevant comments based on the question you ask.
- It waits for your question, finds the most relevant pieces of context, and sends them to LLaMA3, running locally through Ollama, to generate a smart, human-style answer.
To make this happen, we use Python scripts and Docker. Everything you need is already bundled inside the Docker container, so you don’t have to install or configure anything manually.
Here’s what you run to index your data. The -v flag shares a survey_db folder between your machine and the container, so the index you build survives after the container exits:
docker run -it --network=host -v $(pwd)/survey_db:/app/survey_db datadirk-ai python3 index_survey.py
Once that’s done, start the assistant with the same mount so it can read the index you just built:
docker run -it --network=host -v $(pwd)/survey_db:/app/survey_db datadirk-ai
It will ask you to type a question like: What do employees say about recognition in the Sales department?
And it will respond with a synthesized answer based directly on what employees actually said. No keywords, no filtering—just meaning, summarized for you in natural language.
This is where things get exciting. Suddenly, open-text comments become something you can use. You can dig into themes, compare departments, spot blind spots—and do it all in seconds, without waiting for an analyst or writing a single formula.