Data Is a Human Thing
Data isn’t just a technical artifact - it’s a linguistic one. Humans think in language, speak in language, and reason through language. Every dataset is a shadow of human expression, and the closer our representations align with natural language, the more usable, interpretable, and interoperable they become.
This isn’t just about readability. It’s about cognition. Natural language is the native format of human thought and, increasingly, the native substrate of large language models (LLMs). When data is structured in ways that mirror how people speak and reason, it becomes more accessible to both humans and machines. That’s where truenumbers come in.
What Are Truenumbers?
Truenumbers are data values expressed as atomic sentences, each with a subject phrase – the entity being described; a property phrase – an attribute or relationship of the subject; and its value – the quantitative or qualitative data item itself. For example:
The Earth has approximate diameter = 8000 miles
Hyundai Ioniq 5 EV has customer rating = excellent
Tokyo has population = 13.5 million (2023 estimate)
These aren’t just facts - they’re linguistic units of meaning - parseable, composable, and context-aware. They bridge the gap between structured data and natural language, making them ideal for LLM ingestion, retrieval-augmented generation (RAG), and semantic search.
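To make the subject / property / value structure concrete, here is a minimal Python sketch that parses sentences written in the "subject has property = value" form used in the examples above. The class name `TrueNumber`, the field names, and the regular expression are illustrative assumptions, not the actual Truenumbers syntax or API.

```python
import re
from dataclasses import dataclass

# Assumed pattern for the "<subject> has <property> = <value>" form shown
# in the examples; the real truenumber grammar may differ.
_PATTERN = re.compile(
    r"^(?P<subject>.+?)\s+has\s+(?P<prop>.+?)\s*=\s*(?P<value>.+)$"
)


@dataclass(frozen=True)
class TrueNumber:
    """Hypothetical container for one atomic sentence."""
    subject: str  # the entity being described
    prop: str     # an attribute or relationship of the subject
    value: str    # the quantitative or qualitative data item

    def sentence(self) -> str:
        # Render the fact back into its sentence form.
        return f"{self.subject} has {self.prop} = {self.value}"


def parse_truenumber(text: str) -> TrueNumber:
    """Split one sentence into subject, property, and value."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not in 'subject has property = value' form: {text!r}")
    return TrueNumber(
        subject=match["subject"].strip(),
        prop=match["prop"].strip(),
        value=match["value"].strip(),
    )


if __name__ == "__main__":
    fact = parse_truenumber("Tokyo has population = 13.5 million (2023 estimate)")
    print(fact.subject, "|", fact.prop, "|", fact.value)
```

Because each parsed fact can be rendered back into the same sentence, the text form and the structured form carry the same information, which is the property that makes such sentences convenient for both people and LLM pipelines.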
Why Truenumbers Matter
For humans: They align with how we speak and reason. They’re readable and relatable. The data is its own metadata.
For LLMs: They provide semantically rich input that preserves context, intent, and structure. No need for brittle schema mappings or prompt gymnastics. They can also be written by LLMs to provide more usable outputs.
For systems: They enable composability, traceability, and explainability. Each truenumber is a modular unit of knowledge.
What Truenumbers Does Better
Data integration - Any kind of data represented in a common, intelligible language that grows with your requirements
Data portability - Truenumbers are individual facts you can take out of the database and use in documents, web pages and programming without loss of integrity or fear of tampering. Like this one right here: 44.01 (go ahead, click on it).
Real-time distributed systems - Being portable and self-describing, truenumbers can flow among processes on sockets or Kafka. Truenumbers’ change-driven process model enables scalable real-time data flows with unprecedented ease
Thinner, more universal clients - Truenumber data needs less code to make it intelligible to users, and because all truenumbers have the same structure, applications are interoperable. That means less knowledge embedded in code, and more in the data
Analytics, AI and Machine Learning - Because truenumber facts put more knowledge in the database, in one language across all domains, analysis has more to chew on. Being language-based, Truenumbers bridges the gap between structured data and LLMs.
Compliance and security - Truenumbers are immutable. Their unique identity ensures traceability and integrity. Every truenumber has its meaning and identity on board, allowing verification and validation with one click. Secure data even in insecure environments.
Flexible storage options - Truenumbers relies on whatever persistence technology is appropriate. Truenumbers uses MariaDB or MongoDB in its standard configurations, but can be adapted to any storage technology with minimal disruption, because a truenumber is a truenumber no matter where it’s stored
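The portability and integrity points above can be illustrated with a small Python sketch: an immutable fact is serialized to a self-describing JSON message (the kind of payload that could travel over a socket or a Kafka topic) together with a fingerprint derived from its content. Everything here is an assumption made for illustration: the `Fact` class, the JSON layout, and the use of SHA-256 are not taken from the Truenumbers product. A bare hash only detects accidental or naive changes; real tamper resistance would also need a signature or a trusted registry.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Fact:
    """Hypothetical immutable fact with subject, property, and value."""
    subject: str
    prop: str
    value: str


def fingerprint(fact: Fact) -> str:
    """Content-derived identity: SHA-256 over a canonical JSON encoding."""
    canonical = json.dumps(
        asdict(fact), sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def to_message(fact: Fact) -> str:
    """Serialize the fact plus its fingerprint into one portable string."""
    return json.dumps({"fact": asdict(fact), "sha256": fingerprint(fact)}, sort_keys=True)


def verify(message: str) -> bool:
    """Recompute the fingerprint and compare it with the one in the message."""
    payload = json.loads(message)
    return fingerprint(Fact(**payload["fact"])) == payload["sha256"]


if __name__ == "__main__":
    msg = to_message(Fact("The Earth", "approximate diameter", "8000 miles"))
    print(verify(msg))  # the untouched message verifies

    altered = json.loads(msg)
    altered["fact"]["value"] = "9000 miles"
    print(verify(json.dumps(altered)))  # the edited message does not
```

Since the message carries its own meaning and fingerprint, it does not matter whether it is stored in a relational table, a document store, or a flat file; the same check applies wherever it lands.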
Truenumbers is helping build next-generation systems with the Naval Information Warfare Center, San Diego, helping grad students manage data for statistical analysis in Boston University’s MSSP Program, and working with many other enterprises.
Contact us for a briefing to learn how we can give data back to its rightful owner - the user.