AI has gone mainstream, but most enterprises are still struggling with the document data governance needed to scale it, according to a new study by Apryse.

"AI is no longer experimental, it's operational," said Andrew Varley, CPO, Apryse. "But enterprises are discovering that the infrastructure behind it, especially around document data quality, hasn't evolved fast enough. Surging data growth without governance, a lack of visibility into what content already exists, and fragmented tooling are now the biggest barriers to intelligent processing at scale."
Data is struggling to keep up
The survey of 465 organisations across North America, Europe, Australia and New Zealand has revealed the following:
- 64.5% of organisations already run AI in production to boost operational efficiency (63%), enhance customer experience (51%), and support data-driven decisions (41%).
- 76.6% store between 25% and 75% of their data in documents, but only 38.1% rate that data as "excellent" for AI use.
- 67.3% believe it is "extremely important" to keep document processing in-house.
- The top barriers to scaling AI are data security concerns (54%) and data quality (49%).
- 82.8% plan to invest in document automation within the next 12 months. However, almost half lack confidence in their current pipelines.
- 62.8% experience document quality issues "occasionally" or "frequently."
The need for structured document data
According to the respondents, the most critical capabilities in document automation are the following:
- Table/form recognition (59.6%): helps better understand layout and relationships in documents such as invoices, contracts, and forms.
- Developer-friendly SDKs: lower technical barriers to automated document workflows.
- Metadata tagging: enables context-aware data classification, facilitating compliance, searchability, and governance to reduce risk and improve AI accuracy.
