Schema-free AI: How LLMs query unstructured and semi-structured data sources

Akash Chandrasekar

June 22, 2026

Most LLM query systems are built assuming clean, structured data, but real enterprise environments rarely deliver that. Schema-free AI is the approach that lets LLMs reason reliably over MongoDB collections, REST APIs, and Elasticsearch even when the data has no consistent structure. This article covers how it works and what your query layer needs to support it.

Traditional query engines generally fail on dynamic data sources because those sources lack a consistent structure. This is a significant problem for query systems built around large language models (LLMs): wrong assumptions about the structure of the data lead to inaccurate or incomplete results.

This article discusses schema variability, classifies data sources by schema stability, and outlines the requirements for a query layer that produces reliable queries. It also uses AI Matic and D2D as examples of a production-tested method for handling a dynamic enterprise data environment.

The schema assumption problem

Most production Text-to-SQL implementations rely on the same underlying assumption: the data source exposes a fixed, discoverable, and typed schema, and an LLM can generate valid queries once that schema is supplied to it as context.

The more context provided to the model, the more likely it will produce accurate output. This holds true only for a very small number of sources, namely, relational databases that have up-to-date documentation, have standardized naming conventions, and whose table schemas are not changing. Other types of sources will present failure modes that must be handled in the system architecture.

Schema assumption breach

There are four typical ways the schema assumption is violated in the real world:

  • No schema: Some sources lack any traditional schema. Documents in the same collection can differ completely, with no required fields or set data types.
  • Implicit schema: Other sources carry an implicit structure that only becomes clear after looking at actual samples. REST APIs often work this way, returning defined shapes without offering a separate metadata source to check.
  • Schema drift: Even when a structure exists, it can shift over time. Tables gain columns, data types change, or fields that once required values start accepting nulls. Systems that cache schema details then rely on outdated information and return wrong query results.
  • Schema conflict: Multiple sources need to be queried in combination, but they use different naming conventions, type representations, or structural patterns for the same business concept. Joining across them requires resolving these conflicts before the model can generate a valid cross-source query.

The engineering challenge is not that these sources are inherently harder to query. It is that the tooling typically used to generate queries with LLMs was not designed with them in mind, so adapting it takes intentional architectural decisions at each layer of the stack.

Why this matters for LLM query systems

LLMs create queries by reasoning over the schema details they receive. The success of those queries hinges on how accurate and complete that schema information is. When the schema is missing, wrongly inferred, or outdated, the generated queries may look correct but return wrong results or fail during execution.

These failures often stay silent. A query might run successfully on a MongoDB collection yet pull data shaped differently from another collection, producing no error messages. Spotting the issue requires knowing both the intended question and the actual data layout, which makes it a semantic problem rather than a simple syntax one.

Systems limited to structured data sources can use validation tools and dialect checks to catch many such errors. Mixed enterprise environments lack that option. They need methods that do not assume a perfect schema upfront and continue working reasonably even when schema details are partial or incorrect, which is a core challenge addressed by AI data analytics platforms.

Source taxonomy: Classifying data by schema certainty

Classifying data sources by how reliably they expose schema details proves more useful than grouping them by the technology that created them. Two SQL databases, for instance, may run on the same engine yet differ sharply in schema reliability depending on how they were built and maintained. A well-documented MongoDB collection often gives clearer schema information than a legacy PostgreSQL database full of undocumented columns and unexpected null values.

This approach sorts sources according to the schema details available to the query system at runtime. It also shows how much extra help each type needs from the system to produce reliable results.

| Source Category | Characteristics and Query Layer Requirements |
| --- | --- |
| Fully Structured | Schema is declared, typed, stable, and introspectable. SQL databases with enforced constraints and current documentation. The query layer can rely on schema information as ground truth for query generation. |
| Semi-Structured | Schema is partially declared or can be inferred from data but is not enforced across all records. MongoDB collections, Elasticsearch indices with dynamic mappings, Parquet files, and structured log formats. The query layer must treat schema information as probabilistic rather than definitive. |
| Schema-Inferred | No declared schema exists. Structure must be derived from sample data, API documentation, or response introspection. REST APIs and external data feeds fall here. The query layer must build a working schema model from indirect evidence. |
| Schema-Absent | Structure varies per record with no consistent pattern. Raw document stores, unprocessed event streams, and legacy flat files with mixed formats. The query layer must handle structural uncertainty as a first-class condition, not an edge case. |

Most enterprise analytics environments have sources in all four buckets. A query system that handles only the first category, fully structured, covers only a fraction of the real source landscape.

Implications for the metadata layer

Different source types shape how the metadata layer supplies context to the query model. Extracting metadata from fully structured sources is straightforward through schema introspection and formatting. For other categories, engineering choices around metadata extraction strongly influence query quality. Semi-structured sources need decisions on sample size, handling conflicting field types across samples, and showing optional fields.  

Sources with inferred schemas require parsing documentation or responses, with the resulting metadata clearly marked as inferred rather than declared. Sources without any schema demand that the metadata layer signal structural uncertainty to the query model instead of forcing an inaccurate schema onto the data.
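One way to carry that distinction through to the model is to attach a provenance marker to every field the metadata layer emits. The sketch below is a minimal illustration under that assumption, not a description of any particular platform; `Provenance`, `FieldMetadata`, and `render_for_prompt` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from enum import Enum


class Provenance(Enum):
    DECLARED = "declared"    # read from an enforced schema
    INFERRED = "inferred"    # derived from samples, documentation, or responses
    UNCERTAIN = "uncertain"  # no consistent structure was observed


@dataclass(frozen=True)
class FieldMetadata:
    name: str
    type_hint: str
    provenance: Provenance


def render_for_prompt(fields: list[FieldMetadata]) -> str:
    """Format field metadata so the model sees how far to trust each entry."""
    lines = []
    for f in fields:
        lines.append(f"- {f.name}: {f.type_hint} [{f.provenance.value}]")
    return "\n".join(lines)


fields = [
    FieldMetadata("order_id", "integer", Provenance.DECLARED),
    FieldMetadata("status", "string or number", Provenance.INFERRED),
    FieldMetadata("payload", "unknown", Provenance.UNCERTAIN),
]
print(render_for_prompt(fields))
```

Keeping the marker next to each field, rather than in a separate note, means it survives when only part of the metadata is retrieved for a given question.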

These are not minor edge cases solved by a simple fallback. Each category brings its own engineering needs, and the difficulty grows when supporting queries that span multiple source types at once.

The metadata enrichment layer

Metadata enrichment takes basic schema details like table names, field names, data types, and structures from a source and adds semantic meaning to them. This extra layer connects the raw structure to what the model actually needs for creating valid queries. Fully structured sources often carry enough meaning in their raw form. A field named customer_id of type INTEGER in an orders table usually gives clear enough clues about its purpose.  

Semi-structured and inferred sources work differently. Their raw details frequently lack clarity or even mislead. A field called ts in a MongoDB document might hold a Unix timestamp, an ISO date string, or something else entirely. An API response field named status could be a boolean, a string value, or a numeric code. Since models rely heavily on field names in these environments, and those names alone are not dependable signals, queries can easily turn out invalid without added context and meaning.

What enrichment addresses

Three issues often arise when working with raw schema information, even when applying standard practices for metadata.  

Field names tend to be ambiguous. Business terms rarely match technical names exactly, especially in systems built over years by different teams. Enrichment replaces those technical names with clear descriptions of what each field means for the business, reducing dependence on inconsistent naming.  

Document stores like MongoDB bring another challenge through varying structures. Enrichment records not just whether a field exists, but also its typical value patterns, nesting depth, and which documents contain it. This helps the model build queries that work even when document layouts differ or change.

REST APIs create gaps too. Their documentation usually covers endpoints, parameters, and basic response shapes, but rarely explains how the business uses the data or under what conditions certain fields appear. Enrichment adds the missing context on usage and connections between responses.
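For the document-store case, the structural part of enrichment can be sketched as a profile built from sampled documents: how often each path appears, which runtime types it takes, and how deeply it is nested. This is a simplified illustration that assumes samples are plain dictionaries and does not descend into arrays; `walk` and `profile_fields` are hypothetical helper names.

```python
from collections import defaultdict


def walk(doc: dict, prefix: str = ""):
    """Yield (dotted_path, value) for every key, descending into nested objects."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        yield path, value
        if isinstance(value, dict):
            yield from walk(value, path)


def profile_fields(samples: list[dict]) -> dict[str, dict]:
    """Summarise presence ratio, observed types, and nesting depth per path."""
    seen = defaultdict(lambda: {"count": 0, "types": set()})
    for doc in samples:
        for path, value in walk(doc):
            seen[path]["count"] += 1
            seen[path]["types"].add(type(value).__name__)
    total = len(samples)
    return {
        path: {
            "presence": info["count"] / total,
            "types": sorted(info["types"]),
            "depth": path.count(".") + 1,
        }
        for path, info in seen.items()
    }


samples = [
    {"id": 1, "ts": 1700000000, "meta": {"region": "EU"}},
    {"id": 2, "ts": "2023-11-14T22:13:20Z"},
    {"id": 3, "ts": 1700000300, "meta": {"region": "US", "flag": True}},
]
profile = profile_fields(samples)
for path, info in sorted(profile.items()):
    print(path, info)
```

In this sample the `ts` path shows two runtime types and `meta.region` appears in only two of three documents, which is exactly the kind of detail a field name alone would hide from the model.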

Enrichment and semantic retrieval

Semantic metadata enrichment improves context for queries, but it does not by itself solve context window limits. In environments with hundreds of tables, collections, or API endpoints, including all enriched metadata for every query would create too much noise and reduce generation quality. Semantic retrieval addresses this by storing enriched metadata as vector embeddings.

At runtime, the system uses the user’s natural language question to pull only the most relevant enriched metadata. This keeps the schema context focused and compact, no matter how large the overall environment is. The quality of the enriched descriptions matters directly: better descriptions increase the chance that the user’s question matches the right metadata, leading to more accurate retrieval than raw technical names would allow.
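The retrieval step can be illustrated with a toy example. The `embed` function below is a deliberate stand-in (lowercase word counts) for a real embedding model, and the field names and descriptions are invented for illustration; only the ranking logic is the point.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, metadata: dict[str, str], k: int = 2) -> list[str]:
    """Return the names of the k metadata entries most similar to the question."""
    q = embed(question)
    ranked = sorted(
        metadata,
        key=lambda name: cosine(q, embed(metadata[name])),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical enriched descriptions keyed by technical field name.
enriched = {
    "events.ts": "time the transaction event was recorded",
    "events.amt": "transaction amount in the account currency",
    "flags.status": "compliance review status of a flagged transaction",
}
top = retrieve("which transaction was flagged for compliance review", enriched, k=1)
print(top)
```

The question never mentions `flags.status`; it matches because the enriched description uses the analyst's vocabulary, which raw technical names would not.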

Metadata enrichment and semantic retrieval are not optimizations on top of a working system. They are prerequisites for reliable query generation against schema-free sources. Without them, the model is generating queries against a vocabulary it does not understand.

What the query layer requires

A query layer for schema-free data has different requirements from one built for relational databases. Below are the functional requirements the query layer must satisfy across source categories. The implementation approach can differ by source category, but the requirements are independent of any orchestration framework or tool.

Planning before generation

When a structured source has all of its metadata available, a model can generate a query directly from the user's natural language question. With schema-less sources, planning has to come first, so that any assumptions the model makes about the structure of the source are made visible before they are encoded into a generated query. This planning-and-execution pattern is a key component of Agentic AI Accelerator. The result of planning is a "checkpoint" for structural uncertainty: a place to document explicitly that one or more fields will not exist in every record, or that the shape of a REST endpoint's response varies with the parameters passed in the request.

Without planning, assumptions about the structure of the source stay implicit, and when a query fails at execution time there is little diagnostic information about why generation went wrong.
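A checkpoint of this kind can be as simple as a plan object that records one explicit assumption for every field not known to exist in all records. The sketch below assumes per-field presence ratios are already available from sampling; `QueryPlan` and `checkpoint` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class QueryPlan:
    source: str
    required_fields: list[str]
    assumptions: list[str] = field(default_factory=list)


def checkpoint(plan: QueryPlan, presence: dict[str, float], threshold: float = 1.0) -> QueryPlan:
    """Record an assumption for every required field seen in fewer than `threshold` of samples."""
    for name in plan.required_fields:
        ratio = presence.get(name, 0.0)  # fields never sampled count as 0% present
        if ratio < threshold:
            plan.assumptions.append(
                f"{name} present in {ratio:.0%} of sampled records; query must tolerate absence"
            )
    return plan


# Hypothetical presence ratios taken from a sampling pass.
presence = {"txn_id": 1.0, "legs": 0.4}
plan = checkpoint(QueryPlan("events", ["txn_id", "legs", "regulator"]), presence)
for note in plan.assumptions:
    print(note)
```

If the generated query later fails, the recorded assumptions give a starting point for diagnosis that a directly generated query would not.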

Query type diversity

Each category of source has its own query language and execution model. Relational databases are queried with SQL, which is designed for data structured in relational tables. Document stores such as MongoDB use their own query language for stored documents. REST APIs, in contrast, exchange data through HTTP requests and responses rather than through a query language.

To integrate disparate sources seamlessly, the query layer must normalize query results so that every downstream component (visualization, insight generation, user-facing outputs) sees a consistent view of the data and never has to distinguish between source types. This normalization requirement has significant implications for the system architecture.

The query generation and execution logic for each data source must be isolated from that of the others, so that source-specific differences do not propagate through the rest of the system.
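A common way to get this isolation is one adapter per source type behind a shared interface. The sketch below shows only the normalization half and makes assumptions about raw result shapes (a column/row pair for SQL, an iterable of documents for MongoDB, an `items` list for REST); the class names and the `_source` attribution key are illustrative.

```python
from typing import Any, Protocol


class SourceAdapter(Protocol):
    source: str

    def normalize(self, raw: Any) -> list[dict]: ...


class SqlAdapter:
    source = "sql"

    def normalize(self, raw: Any) -> list[dict]:
        # raw is assumed to be (column_names, rows)
        columns, rows = raw
        return [dict(zip(columns, row), _source=self.source) for row in rows]


class DocumentAdapter:
    source = "mongodb"

    def normalize(self, raw: Any) -> list[dict]:
        # raw is assumed to be an iterable of documents
        return [{**doc, "_source": self.source} for doc in raw]


class RestAdapter:
    source = "rest"

    def normalize(self, raw: Any) -> list[dict]:
        # raw is assumed to be a decoded JSON object with records under "items"
        return [{**item, "_source": self.source} for item in raw.get("items", [])]


def run(adapter: SourceAdapter, raw: Any) -> list[dict]:
    """Downstream code only ever sees a list of dicts with a source attribution."""
    return adapter.normalize(raw)


rows = (
    run(SqlAdapter(), (["id", "amount"], [(1, 9.5)]))
    + run(DocumentAdapter(), [{"id": 2, "amount": 3.0}])
    + run(RestAdapter(), {"items": [{"id": 3, "amount": 7.25}]})
)
print(rows)
```

Downstream components iterate over `rows` without branching on source type, while the attribution key keeps results traceable to where they came from.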

Bounded retry on failure

Schema-free data sources produce more query generation failures than fully structured ones, mainly because metadata confidence is lower and incorrect structural assumptions are more likely. The query layer therefore needs a bounded retry mechanism that returns execution failures to the query generation step for another attempt.

Unbounded retries are not appropriate against production data sources: repeatedly executing failing queries consumes resources and can exhaust the request quota of rate-limited API services.

The maximum number of retries should be configurable per deployment, and the system should report detailed information about both generation and execution failures, to support diagnosis and later improvement.

For example: "In practice, a limit of three retries prevents a misconfigured MongoDB enrichment from consuming an entire API rate-limit quota in a single session."
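A bounded retry loop that feeds each execution error back into generation might look like the sketch below. `generate` and `execute` are placeholders for the model call and the source adapter, and the `fake_*` functions exist only to make the example runnable; note that `max_attempts` here counts total attempts, including the first.

```python
from typing import Callable, Optional


class RetryExhausted(Exception):
    def __init__(self, attempts: list[tuple[str, str]]):
        super().__init__(f"gave up after {len(attempts)} attempts")
        self.attempts = attempts  # (query, error) pairs kept for diagnosis


def run_with_retry(
    generate: Callable[[str, Optional[str]], str],
    execute: Callable[[str], list],
    question: str,
    max_attempts: int = 3,
) -> list:
    """Generate and execute a query, passing each failure back to generation."""
    attempts: list[tuple[str, str]] = []
    last_error: Optional[str] = None
    for _ in range(max_attempts):
        query = generate(question, last_error)
        try:
            return execute(query)
        except Exception as exc:  # a real system would catch narrower errors
            last_error = str(exc)
            attempts.append((query, last_error))
    raise RetryExhausted(attempts)


# Hypothetical stand-ins for the model call and the source adapter.
def fake_generate(question: str, error: Optional[str]) -> str:
    return "find legs.amount" if error else "find amount"


def fake_execute(query: str) -> list:
    if query == "find amount":
        raise KeyError("amount is nested under legs")
    return [{"legs.amount": 10}]


print(run_with_retry(fake_generate, fake_execute, "total amount?"))
```

Raising an exception that carries every failed query and its error keeps the diagnostic trail intact when the bound is reached.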

Structured output consistency

Depending on the model version and provider, the output format of an LLM can vary. Models and their versions also change often, so the same request to an LLM may come back in different formats. Query systems therefore need fallback logic that handles a failure to produce the expected structured output and does not let that failure break the response contract exposed downstream.

This requirement is usually less visible than that related to query accuracy but is one of the more common production failure modes, especially when moving from one LLM provider to another or when an LLM provider modifies how their model behaves without a corresponding version update.

For example: "A provider silently updating their model's output format has caused downstream parsing failures in production systems with no model version change logged."
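A defensive parser is one way to implement that fallback logic. The sketch below assumes the expected contract is a JSON object with a string `query` key, which is an assumption made for illustration; it tries the raw text, then a fenced block, then the first brace-delimited span, and always returns a well-formed result instead of raising.

```python
import json
import re

FENCE = "`" * 3  # Markdown code fence, built this way to keep the example readable


def parse_model_output(text: str) -> dict:
    """Return {'ok': True, 'query': ...} or {'ok': False, 'error': ...}; never raise."""
    candidates = [text.strip()]
    # Fallback 1: JSON wrapped in a Markdown code fence
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    if fenced:
        candidates.append(fenced.group(1).strip())
    # Fallback 2: the outermost {...} span embedded in surrounding prose
    braces = re.search(r"\{.*\}", text, re.DOTALL)
    if braces:
        candidates.append(braces.group(0))
    for candidate in candidates:
        try:
            data = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(data, dict) and isinstance(data.get("query"), str):
            return {"ok": True, "query": data["query"]}
    return {"ok": False, "error": "model output did not match the expected contract"}


print(parse_model_output('Sure: {"query": "db.events.find({})"} hope that helps'))
print(parse_model_output("I cannot answer that."))
```

Because the function returns the same shape on success and failure, a change in provider formatting shows up as an explicit `ok: False` result rather than as a parsing exception somewhere downstream.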

Failure modes specific to schema-free AI sources

Queries across schema-free sources fail in more ways than queries against structured sources. The table below lists the failure modes seen in live schema-free query systems, what causes each one, and what the system design must do to address it.

| Failure Mode | Description and Mitigation Requirement |
| --- | --- |
| Stale metadata | Enriched metadata is created at a point in time. If the source's schema later changes (for example, new fields are added, the shape of documents changes, or the format of an API response changes), the enriched metadata no longer matches the source. A query generated against old metadata may still run but return results that are not what the question asked for. Mitigation requires a mechanism to refresh the enriched metadata and monitoring for schema drift to track how often it happens. |
| Embedding drift | The quality of semantic retrieval will degrade if the embedding model changes after the metadata collection has been created. The same query can return different schema contexts before and after an embedding model change, which can create inconsistencies in query performance without any indication in either the source or the query layer. Mitigation of this issue requires that existing metadata collections be re-embedded whenever the embedding model changes. |
| Type coercion errors | In semi-structured sources, the same conceptual field can be stored with different runtime types, often because the code writing the records changed over time. For example, a date field may be stored as a string in older records and as a date object in newer ones, so a query that is unaware of the difference produces type errors. Mitigation requires type coercion at the query layer and enriched metadata that documents where type variance is present. |
| Nested document traversal failures | MongoDB and similar document stores support arbitrarily nested structures. Query generation for nested paths requires the model to have accurate information about nesting depth and field availability at each level. Enriched metadata that flattens nested structures for readability can cause the model to generate traversal paths that do not exist in the actual data. |
| API response variability | REST API responses frequently include or omit fields conditionally based on request parameters, authentication scope, or data availability. A query generated from documentation-derived metadata may reference fields that are not present in the actual response for a given request. The query layer must detect missing fields in responses and return this information to the generation step rather than passing incomplete results downstream. |
| Cross-source schema conflict | Queries that need data from multiple sources have to manage schema conflicts between them. Two sources may represent the same business concept with different field names, type conventions, or levels of detail. The query layer must expose these conflicting representations rather than silently returning results that combine incompatible ones. |
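Of these, type coercion is the easiest to illustrate in isolation. The sketch below normalizes a timestamp that may arrive as epoch seconds, an ISO 8601 string, or a datetime object. Treating naive values as UTC is a policy assumption made here for the example, and `to_utc` is a hypothetical helper name.

```python
from datetime import datetime, timezone
from typing import Optional


def to_utc(value: object) -> Optional[datetime]:
    """Coerce epoch seconds, ISO 8601 strings, or datetimes to an aware UTC datetime."""
    if isinstance(value, datetime):
        dt = value
    elif isinstance(value, bool):
        return None  # bool is an int subclass; do not treat it as a timestamp
    elif isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    elif isinstance(value, str):
        try:
            dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
        except ValueError:
            return None
    else:
        return None
    # Naive values are assumed to be UTC; this is a policy choice, not a fact.
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


for raw in (1700000000, "2023-11-14T22:13:20Z", datetime(2023, 11, 14, 22, 13, 20), "n/a"):
    print(repr(raw), "->", to_utc(raw))
```

Returning `None` for values that cannot be coerced lets the query layer count and report them instead of failing the whole result set.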

Composite scenario: Multi-source intelligence platform

The following scenario is a composite built from patterns observed across many production deployments; it is illustrative and does not describe any one client. It shows how the source taxonomy, query layer requirements, and failure modes from the earlier sections play out in a production deployment.

Scenario context

Composite Scenario  

Sector: Financial services operations  

Sources: MongoDB event store (transaction and interaction records), two REST API feeds (position data, compliance flags), Elasticsearch audit log.

Prior capability: Analysts queried each source separately using source-specific tooling; no unified natural language query capability; cross-source questions required manual data joining.  

Goal: Enable operations analysts to issue natural language questions across all four sources and receive consistent, source-attributed results, similar to the capabilities demonstrated in GoML's AI hedge fund software platform.

Source-by-source engineering observations

Each source in this scenario presented distinct schema challenges:

MongoDB Event Store: The collection held seven years of records spanning the system's entire evolution. Early records had a flat structure of 12 fields. Later records moved some of those fields into nested objects for regulatory metadata, added multi-leg transaction arrays, and dropped original fields that were no longer in use, while older documents still carrying them remained in the collection. Enrichment therefore required sampling across each of the four record-age cohorts to capture every observed document shape.

Had only recent records been sampled, the enriched metadata would not have covered the older document shapes, even though a large volume of queries targets historical data.

REST API feeds: Both REST API sources included documentation, yet it described the intended response structure rather than the actual one. One API simply left out fields when their value was null.

The other changed its response structure depending on the authentication scope of the credential used. Accurate enrichment required sampling live responses under real production authentication conditions to capture how the APIs behaved.  

Elasticsearch audit log: Because dynamic mapping was used throughout indexing, the same field name was mapped to different data types in different indices.

Queries that treated these timestamp fields as a single consistent type produced type errors on some of the older indices.

Cross-source query behaviour

Cross-source queries were the most technically challenging part of this engagement. The MongoDB Event Store and the Elasticsearch audit log both had records of transaction events but used different identifiers, timestamp formats, and terms to describe the same type of event.

For example, answering a question about flagged transactions required derived matching logic to correlate records from the MongoDB Event Store and the Elasticsearch audit log, because a direct key join between the two was not possible. Single-source queries worked reliably once enrichment was complete.

Querying across sources, however, required additional prompt engineering to highlight the schema conflicts between the two sources and to guide the model towards producing correlation logic instead of assuming a common key. Cross-source queries are consequently the query type most likely to produce generation errors in schema-free multi-source deployments.
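To make "correlation logic rather than a key join" concrete, the sketch below pairs records from two sources that share no identifier by matching on account, amount, and a time window. All field names and the five-second window are invented for illustration and are not taken from the scenario; the timestamps are assumed to have been coerced to datetimes already.

```python
from datetime import datetime, timedelta


def correlate(events: list[dict], audits: list[dict], window: timedelta) -> list[tuple[dict, dict]]:
    """Pair records that share an account and amount and occur within `window`."""
    pairs = []
    used = set()  # each audit record may be matched at most once
    for event in events:
        for i, audit in enumerate(audits):
            if i in used:
                continue
            same_subject = (
                event["account"] == audit["acct_no"]
                and event["amount"] == audit["value"]
            )
            close_in_time = abs(event["at"] - audit["logged_at"]) <= window
            if same_subject and close_in_time:
                pairs.append((event, audit))
                used.add(i)
                break
    return pairs


events = [
    {"event_id": "e1", "account": "A1", "amount": 100, "at": datetime(2024, 1, 1, 12, 0, 0)},
    {"event_id": "e2", "account": "A2", "amount": 50, "at": datetime(2024, 1, 1, 13, 0, 0)},
]
audits = [
    {"audit_ref": "x9", "acct_no": "A1", "value": 100, "logged_at": datetime(2024, 1, 1, 12, 0, 3)},
    {"audit_ref": "x10", "acct_no": "A2", "value": 50, "logged_at": datetime(2024, 1, 1, 15, 0, 0)},
]
pairs = correlate(events, audits, timedelta(seconds=5))
print([(e["event_id"], a["audit_ref"]) for e, a in pairs])
```

Matching of this kind is heuristic, so unmatched records on either side are worth surfacing to the analyst rather than dropping silently.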

Outcomes

| Dimension | Result |
| --- | --- |
| Time to first reliable query | Single-source queries operational within two days of enrichment completion for each source |
| Cross-source query accuracy | Above 85% on queries requiring two sources; lower on three- and four-source combinations pending multi-collection retrieval improvements |
| Metadata refresh cycle | Weekly automated re-enrichment implemented for all sources; drift detection alerts configured for schema-breaking changes |
| Failure mode encountered | REST API response variability caused generation errors until live sampling replaced documentation-derived metadata; resolved within one day of diagnosis |
| Analyst time impact | Manual cross-source joining eliminated for the 70% of questions that could be answered by querying two or fewer sources |

Frequently asked questions by data architects and technical leads

The following addresses questions raised during technical evaluations of schema-free AI query systems. They reflect concerns from data architects, platform engineers, and technical leads working across heterogeneous source environments.  

On schema governance and enrichment

We have a schema registry. Can the enrichment layer use it as an input?

Yes. A schema registry can be a better starting point for enrichment than source introspection, particularly in event-stream environments that use Avro or Protobuf schemas. Enrichment adds semantic descriptions and business vocabulary translations to the registry's content rather than replacing it.

If a source has no registry, enrichment has to start from source introspection or from processing the source's documentation.

How do we handle enrichment refresh when source schemas evolve continuously?

Schema changes happen steadily, so enrichment needs more than simple scheduled updates. The effective method mixes regular refreshes with change detection. The system watches for shifts like new fields, type modifications, or altered document shapes, then refreshes only the affected parts.  

Full refreshes cost too much at scale. Updating just the changed sections keeps metadata accurate without unnecessary work. How often to check and refresh depends on how quickly each source evolves.  
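Change detection of this kind can start from a plain comparison of two schema snapshots. The sketch below assumes each snapshot is a mapping from field path to observed type name, as a sampling or introspection pass might produce; `diff_schema` and `fields_to_refresh` are hypothetical names.

```python
def diff_schema(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {field_path: type_name} snapshots of one source."""
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    retyped = sorted(
        path
        for path in previous.keys() & current.keys()
        if previous[path] != current[path]
    )
    return {"added": added, "removed": removed, "retyped": retyped}


def fields_to_refresh(change: dict[str, list[str]]) -> list[str]:
    """New or retyped fields need re-enrichment; removed fields are simply dropped."""
    return sorted(set(change["added"]) | set(change["retyped"]))


last_week = {"id": "int", "ts": "str", "amount": "float"}
today = {"id": "int", "ts": "datetime", "amount": "float", "legs": "list"}

change = diff_schema(last_week, today)
print(change)
print(fields_to_refresh(change))
```

Only `legs` and `ts` would be re-enriched here; the unchanged fields keep their existing descriptions and embeddings.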

Our MongoDB collections have no enforced schema at all. What does enrichment look like for truly schema-absent data?

For collections with no schema at all, enrichment shifts from documenting the expected values of each field to documenting the patterns actually observed. It involves sampling documents, identifying which fields are present in the majority of them, documenting the value patterns for those fields, and explicitly flagging fields that appear in only a subset of documents.

The resulting metadata describes the typical shape of a document rather than a schema definition. The query generation prompt for a schema-less source is also designed to reflect field optionality. Query generation for such a source carries more uncertainty than for a source whose enrichment is backed by a schema, so the way results are presented to end users should reflect that uncertainty.

On production reliability of schema-free AI

What happens to query behaviour when the embedding model is updated?

Updating the embedding model changes the vector representations of both stored metadata and incoming queries. If the stored metadata was embedded with the older model and the query is embedded with the newer one, retrieval similarity scores become unreliable because the two vectors are no longer in a comparable space.

The only way to keep retrieval reliable is therefore to re-embed all existing metadata collections before deploying the updated model to production. This should be treated as a deployment dependency rather than an optional maintenance activity.
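Treating re-embedding as a deployment dependency can be enforced with a simple pre-deployment check, assuming each metadata collection records which embedding model produced its vectors. The class names and model identifiers below are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataCollection:
    name: str
    embedding_model: str  # identifier of the model that produced the stored vectors


class EmbeddingModelMismatch(RuntimeError):
    pass


def assert_compatible(collections: list[MetadataCollection], query_model: str) -> None:
    """Fail the deployment check if any collection was embedded with another model."""
    stale = [c.name for c in collections if c.embedding_model != query_model]
    if stale:
        raise EmbeddingModelMismatch(
            f"re-embed before deploying {query_model}: {', '.join(stale)}"
        )


collections = [
    MetadataCollection("mongo_events", "embed-v1"),
    MetadataCollection("audit_log", "embed-v2"),
]
try:
    assert_compatible(collections, "embed-v2")
except EmbeddingModelMismatch as err:
    print(err)
```

Running the check in the deployment pipeline turns a silent retrieval-quality regression into a visible, blocking failure.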

How do we detect when query results are wrong due to stale metadata rather than model error?

Silent failures caused by stale metadata are one of the most difficult operational issues in schema-less query systems. The only feasible way to detect them is to keep a structured log of the schema context supplied to each query generation call and to run periodic validation queries that compare results against expected outputs for known inputs.

If a validation query yields an unexpected result, compare the schema context logged when the query was generated with the current enriched metadata to determine whether stale metadata is the cause. In addition, automated schema drift detection, which compares the results of source introspection with the most recent enrichment, provides an earlier warning.
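The logging and comparison steps can be sketched with a stable fingerprint of the schema context. This is a minimal illustration that keeps the log in memory; `fingerprint`, `log_generation`, and `diagnose` are hypothetical names, and the two outcome labels are only a first triage signal.

```python
import hashlib
import json


def fingerprint(schema_context: dict) -> str:
    """Stable hash of the schema context handed to a generation call."""
    canonical = json.dumps(schema_context, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def log_generation(log: list[dict], question: str, schema_context: dict) -> None:
    """Append a structured record of what the model was shown."""
    log.append({
        "question": question,
        "context": schema_context,
        "fingerprint": fingerprint(schema_context),
    })


def diagnose(entry: dict, current_context: dict) -> str:
    """Classify a failed validation query using the logged schema context."""
    if entry["fingerprint"] != fingerprint(current_context):
        return "stale-metadata"  # context used at generation differs from current enrichment
    return "model-or-data"       # metadata matches; look at generation or the source itself


log: list[dict] = []
log_generation(log, "flagged transactions last week", {"ts": "str", "status": "str"})
print(diagnose(log[0], {"ts": "datetime", "status": "str"}))
print(diagnose(log[0], {"status": "str", "ts": "str"}))
```

Sorting keys before hashing means two contexts with the same content but different key order produce the same fingerprint, so only real changes are flagged.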

We need to support both SQL and MongoDB queries from the same service. How is normalization handled?

Normalization occurs at the results level, not at the query level. SQL returns a typed row set, MongoDB returns an array of documents, and a REST API endpoint returns a JSON response object. Each source type requires adapter-level normalization that transforms its results into a common structure before they are sent to downstream components.

The query generation layer creates source-specific syntax based on the target system, and execution takes place through the corresponding adapter. GoML's AI Matic platform handles schema-free query environments in production across MongoDB, REST APIs, and Elasticsearch out of the box.