
    The Missing Piece of AI Transformation: Conversation Intelligence

    95% of AI pilots fail to move a revenue number. The pattern isn't about models. It is about a context layer most teams leave empty.
    Jun 27, 2026
    Contents
    • Why did the agent not know?
    • The real reason 95% fail
    • Why this is harder in non-English meetings
    • FAQ
    • Get started

    "95% of enterprise generative AI pilots produce no measurable revenue impact."

    - MIT NANDA, State of AI in Business 2025

    Monday morning. An alert pops up on the dashboard.

    "Acme Co.'s PoC usage has dropped 30% over the past seven days."

    A CS agent auto-generates a response. "Offer a discount."

    The account owner smiles. Two weeks ago, in a PoC meeting with Acme, the CTO said this:

    "For our team to plug your agents into ours, your API has to work. We're stuck on the integration, so we're not really using the product. Security review is still pending too."

    This wasn't a price problem. It was an integration bottleneck. Offering a discount at that moment sends exactly one signal to the customer: "this vendor does not understand our situation." The account owner ignores the agent's suggestion, writes an email, books a technical support call, extends the PoC, and attaches an integration guide.

    The agent was wrong. But it's hard to call this the agent's fault. The single most important conversation the account owner had with Acme's CTO was never available to the agent in the first place.

    Let us replay the scene.

    The agent receives the same alert. This time, it does something different. It pulls transcripts from the three most recent meetings with Acme. It finds the CTO's line: "we're stuck on the integration, so we're not really using the product." It also catches that security review hasn't cleared, and that email response times have been getting longer. Then it gives its read:

    "Primary bottleneck: API integration complexity and delayed security approval. Recommend scheduling a technical support call, extending the PoC, and sending an integration guide."

    The account owner reviews the draft email the agent has written, and hits send.

    The difference between the two scenes is not the model. The model is identical, and models will keep getting better and cheaper regardless. The only difference is whether the agent had access to what was said in that meeting.


    Why did the agent not know?

    In the previous article, we argued that AI transformation is built from five layers, and that most companies invest only in the last three: AI Model, Agentic Workflow, and User Interface. The first two layers, Context Source and Ontology, are usually left empty. The scene above shows what that gap actually costs.

    When companies start an AI transformation, the first move is almost always to wire up business tools via connectors: docs, issue tracker, CRM, chat. The assumption is that if these are connected, the AI will understand the company.

    It won't. Docs capture what was decided, not the discussion that led there. Issue trackers show what tasks exist, not why they were prioritized over others. CRMs show which stage the customer is in, not why they're hesitating.

    Today's connectors collect the outcomes of decisions. They don't collect the process. And the process is buried in conversations and meetings, where over 90% of the context vanishes the moment the call ends.

    "But we already use an AI notetaker."

    You probably do. But would that notetaker have helped the agent in the scene above? Most AI notetakers produce summaries for humans to read. They don't produce something an agent can search and reason over.

    A transcript is a record, not context. Even after audio becomes text, several more transformations are needed before an agent can use it. Speakers need to be separated. Content needs to be grouped by topic. Decisions and action items need to be extracted. Metadata for participants, meeting ID, and date needs to be attached. And finally it needs to be stored as a searchable vector. Only at the end of this pipeline do we get something worth calling LLM-Ready Data.
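    The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Tiro's implementation: the `Utterance` and `ContextChunk` types, the keyword rules, the meeting ID, and the date are all hypothetical, speaker separation is assumed to have happened upstream, and `toy_embedding` is only a stand-in for a real embedding model.

```python
from dataclasses import dataclass, field
import hashlib
import math


@dataclass
class Utterance:
    speaker: str  # assumed output of an upstream speaker-separation step
    text: str


@dataclass
class ContextChunk:
    meeting_id: str
    date: str
    participants: list[str]
    topic: str
    utterances: list[Utterance]
    decisions: list[str] = field(default_factory=list)
    embedding: list[float] = field(default_factory=list)


# Hypothetical keyword rules; a real system would likely use a model here.
TOPIC_KEYWORDS = {"integration": ["api", "integration"], "security": ["security"]}
DECISION_MARKERS = ("agreed", "we will", "let's")


def topic_of(text: str) -> str:
    """Assign a topic with naive substring matching."""
    lowered = text.lower()
    for topic, words in TOPIC_KEYWORDS.items():
        if any(word in lowered for word in words):
            return topic
    return "other"


def toy_embedding(text: str, dim: int = 16) -> list[float]:
    """Placeholder embedding: hashed bag of words, L2-normalized."""
    vec = [0.0] * dim
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_chunks(meeting_id: str, date: str, utterances: list[Utterance]) -> list[ContextChunk]:
    """Group by topic, extract decisions, attach metadata, and embed."""
    participants = sorted({u.speaker for u in utterances})
    by_topic: dict[str, list[Utterance]] = {}
    for u in utterances:
        by_topic.setdefault(topic_of(u.text), []).append(u)

    chunks = []
    for topic, group in by_topic.items():
        decisions = [
            u.text for u in group
            if any(marker in u.text.lower() for marker in DECISION_MARKERS)
        ]
        joined = " ".join(f"{u.speaker}: {u.text}" for u in group)
        chunks.append(ContextChunk(meeting_id, date, participants, topic,
                                   group, decisions, toy_embedding(joined)))
    return chunks


utterances = [
    Utterance("CTO", "We're stuck on the API integration, so we're not really using the product."),
    Utterance("CTO", "Security review is still pending too."),
    Utterance("Account owner", "Agreed, we will schedule an integration support call."),
]
chunks = build_chunks("acme-poc-03", "2026-06-13", utterances)
for chunk in chunks:
    print(chunk.topic, len(chunk.utterances), chunk.decisions)
```

    Each chunk carries its own participants, meeting ID, date, topic, and decisions, so an agent can filter on metadata before it ever runs a vector search. A bare transcript cannot support that.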

    An AI notetaker is a productivity tool that saves an individual's time. Conversation Intelligence is the infrastructure that fuels an organization's AI systems. The same audio file can produce two completely different kinds of output, depending on how far you transform it. One is personal convenience. The other is an organizational asset.


    The real reason 95% fail

    When MIT NANDA analyzed more than 300 enterprise AI projects, the failed projects shared one thing: the AI never learned the organization's workflow and never adapted to its rhythm.

    The industry's vocabulary tells the same story. The phrase popularized by ChatGPT, "prompt engineering," has been quietly replaced by "context engineering." The real source of value has shifted. As OpenAI co-founder Andrej Karpathy put it: "Context engineering is the delicate art and science of filling the context window with just the right information for the next step."

    https://x.com/karpathy/status/1937902205765607626

    What separates a useful LLM from a useless one isn't the model. It's what you show the model. And before it's a technical problem, "what you show it" is a problem of organization and workflow.

    The 95% failure rate isn't far from the scene we opened with. In most failing AI transformations, the budget is spent on models, agents, and UIs. The real problem sits one layer below, in the context layer that determines what gets shown at all.


    Why this is harder in non-English meetings

    You might be thinking: "fine, let us just record meetings properly with a general-purpose tool."

    The most widely used speech recognition model has a word error rate of around 4% in English. In many other languages it rises sharply. Korean, for example, runs around 14% on the same model. In a business meeting, that gap is fatal. When 14 of every 100 words are wrong, some of them are names, amounts, or dates, and the transcript stops being a source of truth and becomes a source of error. This isn't unique to Korean; most non-English meetings inherit some version of the same problem.
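    For readers unfamiliar with the metric: word error rate is conventionally computed as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The sketch below implements that standard definition with a word-level edit distance. The function name and the example sentences are illustrative and do not come from any particular benchmark or product.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative example: two wrong words out of ten, a name and an amount.
reference = "send the contract to acme by friday for 30000 dollars"
hypothesis = "send the contract to acne by friday for 13000 dollars"
wer = word_error_rate(reference, hypothesis)
print(f"{wer:.0%}")
```

    The example shows why a modest-looking error rate can still be damaging: the two misrecognized words are the customer name and the amount, which are exactly the fields downstream systems depend on.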

    Tiro was built for exactly this. It supports 15 languages, including Korean and Japanese, and treats data protection as a default rather than an add-on: AES-256 encryption at rest, TLS 1.3 in transit, and AWS KMS-managed keys out of the box. Original audio is deleted immediately after transcription, and third-party AI providers are contractually prohibited from using the audio for training.

    The design started from one question. "What does a conversation need to become an organization's source of truth?" Tiro set three standards:

    • Transcription must be accurate, in whatever language the conversation happens in.

    • Security must be robust.

    • The output must be structured so an agent can read it immediately.

    In the "AI notetaker" category, "accuracy" usually means "how good is the summary?" In the Conversation Intelligence category, the questions are different: "What error rate can we tolerate before this record is accepted as the organization's truth?" and "How structured can we make this output so other systems can use it?" That difference in framing is what shapes the product.


    FAQ

    How is Conversation Intelligence different from an AI notetaker?

    An AI notetaker is a productivity tool that saves an individual's time. Conversation Intelligence is infrastructure that turns extracted data into a structured asset for agents, CRMs, RAG systems, and other AI tooling. The same audio file produces two different things, depending on how far you transform it.

    Why is a basic transcript not enough?

    A transcript only converts voice to text. For an agent to actually use it, you need speaker separation, topic segmentation, decision extraction, metadata attachment, and embeddings. A text chunk without context can be searched, but not reliably used.

    What goes wrong if I use a general-purpose tool for non-English meetings?

    General-purpose models can be several times less accurate outside English; in Korean, for instance, the word error rate runs roughly three and a half times higher (around 14% versus 4%). In meetings full of names, numbers, and dates, that gap turns the transcript from a source of truth into a source of error, and everything downstream (agents, CRMs, search) inherits the mistakes.


    Get started


    Tiro Blog
