Use Case · May 12, 2026 · 2 min read

JBS Dev says start AI projects with messy data, skip vendor prep

Technology provider argues LLMs handle poor-quality data better than vendors claim, citing a medical billing case with mixed PDF and image records.

By Agentic Daily · Verified Source: AI News

Our Take

Rose correctly identifies vendor upsell tactics but offers no benchmarks on accuracy degradation with messy data or cost comparisons for his DIY approach.

Why it matters

Enterprises are delaying AI deployments waiting for perfect data pipelines when current models may already handle their existing datasets adequately.

Do this week

Data teams: Before investing in data transformation, run a pilot on your messiest dataset this week to baseline what current LLMs can actually handle.
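One way to run such a pilot is a minimal harness that measures what fraction of raw records a model can process without any cleanup. This is a sketch, not the article's method: `extract_fields` is a placeholder for whatever model API you actually call, and the required-field list is a hypothetical example.

```python
import json

def extract_fields(record_text: str) -> str:
    """Placeholder for a real LLM call (e.g. a chat-completion request).
    Returns a canned JSON response here so the sketch is runnable."""
    return json.dumps({"patient_id": "A123", "amount": "250.00"})

# Hypothetical fields a billing pilot might require; adjust to your data.
REQUIRED_FIELDS = {"patient_id", "amount"}

def record_passes(record_text: str) -> bool:
    """A record 'passes' if the model returns parseable JSON
    containing every required field."""
    try:
        fields = json.loads(extract_fields(record_text))
    except json.JSONDecodeError:
        return False
    return REQUIRED_FIELDS.issubset(fields)

def baseline(records: list[str]) -> float:
    """Fraction of messy records the model handles with no preprocessing."""
    passed = sum(record_passes(r) for r in records)
    return passed / len(records) if records else 0.0
```

That single number is the baseline to compare against any vendor's quoted post-cleanup accuracy before you pay for a transformation project.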

JBS Dev challenges perfect data requirements

Joe Rose, president at JBS Dev, argues enterprises don't need perfect data before deploying generative AI systems. "It's a common misconception that your data has to be perfect before you do any of these types of workloads," Rose said in an AI News interview.

Rose cites a medical billing client case where records mixed PDFs and images with inconsistent field placement. The AI system handled OCR, text extraction, and record validation from simple prompts, then used agentic approaches to compare customer records against insurance contracts for billing verification. "You start to layer different use cases on top of one another," Rose explained, describing automation rates growing from 20% to 80% over time.
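The layering Rose describes can be sketched as a pipeline where each stage is added over time: extraction first, then validation, then contract comparison. All function names, prompts, and data shapes below are illustrative assumptions, and the model call is stubbed so the sketch runs as-is.

```python
import json

def llm(prompt: str) -> str:
    """Placeholder for a real model call; returns canned output here."""
    return json.dumps({"procedure": "99213", "billed": 125.0})

def extract_record(page_text: str) -> dict:
    """Layer 1: prompt the model to pull billing fields out of OCR'd
    text with inconsistent field placement."""
    return json.loads(llm(f"Extract billing fields as JSON:\n{page_text}"))

def validate_record(record: dict) -> bool:
    """Layer 2: sanity checks before anything downstream runs."""
    return bool(record.get("procedure")) and record.get("billed", 0) > 0

def verify_against_contract(record: dict, contract_rates: dict) -> bool:
    """Layer 3 (the agentic step): compare the extracted record against
    the insurance contract's negotiated rate for that procedure."""
    rate = contract_rates.get(record["procedure"])
    return rate is not None and record["billed"] <= rate
```

Each layer only needs the one before it to exist, which is how automation coverage can grow incrementally rather than requiring the full pipeline on day one.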

Rose also recommends enterprises stop buying SaaS AI tools and build directly on cloud platforms instead. "Almost everybody's got some kind of cloud presence, and that's where I would start, because the cloud tooling, especially for the big three... has everything you need to start implementing agentic workloads tomorrow, without new software licenses and new training," he said.

Cost sustainability becomes the next bottleneck

Rose predicts the focus will shift from model capability improvements to cost and portability. "I think you're going to see a shift away from these radical leaps in model capability, and more shift towards 'how do we make the cost more sustainable that we don't have to build data centres at the rate we're building data centres?'"

He expects the "last mile" challenge will be running models on laptops and phones rather than data centers, given that models have already consumed most available training data. "The models were trained on a body of data - essentially every page on the internet and other stuff. It's not like there's a tonne more data that hasn't already been put into them that's going to lead to some type of breakthrough."

Human-in-the-loop remains mandatory

Rose emphasizes that messy data approaches still require human oversight. "That's not to say that it gets everything right - you still need a human in the loop," he noted. The inherent unpredictability of models means enterprises must design for handling bad outputs rather than expecting deterministic results.
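Designing for bad outputs, rather than assuming deterministic results, usually means a routing step: outputs that fail automated checks go to a review queue instead of straight through. A minimal sketch, with hypothetical check functions and an in-memory queue standing in for a real review workflow:

```python
# In-memory stand-in for a real human-review workflow or ticket queue.
review_queue: list[str] = []

def nonempty_procedure(out: dict) -> bool:
    return bool(out.get("procedure"))

def positive_amount(out: dict) -> bool:
    return out.get("billed", 0) > 0

def route(record_id: str, output: dict) -> str:
    """Send an LLM output down the automated path only if every
    check passes; otherwise queue it for human review."""
    checks = [nonempty_procedure, positive_amount]
    if all(check(output) for check in checks):
        return "auto"
    review_queue.append(record_id)
    return "human_review"
```

The share of records taking the "auto" path is the automation rate; the design assumes some records will always land in the queue, which is the human-in-the-loop guarantee Rose describes.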

For enterprises considering the DIY approach, Rose suggests starting with existing cloud infrastructure rather than purchasing new software licenses. However, he provides no specific cost comparisons between cloud-native development and SaaS alternatives, nor accuracy benchmarks for the medical billing case study.

#LLM · #Enterprise AI · #Agents · #Developer Tools