This post is part three of a series about the use of retrieval-augmented generation (RAG) in AI models. Part one discussed how RAG streamlines knowledge management systems to improve access to accurate information, and part two explored RAG’s pivotal role in AI-powered customer support.
As more businesses engage with customers and employees using AI-powered chatbots and virtual assistants, the pressure is on to deliver the fastest and most accurate responses possible. Recent survey data from Statista revealed that 82% of consumers prefer using chatbots over waiting for a human customer service agent. In other words, customers prioritize immediate assistance over human interaction.
To deliver on these changing customer expectations, organizations are increasingly supplementing their chatbots with retrieval-augmented generation (RAG).
Unlike traditional AI systems that rely on pre-trained and potentially outdated data, RAG excels at mining authoritative knowledge bases outside the AI model’s training data to generate more accurate and up-to-date responses faster.
But RAG is not a one-size-fits-all solution. As its capabilities expand, new variations have emerged to address specific use cases. Among these, GraphRAG, Speculative RAG, and RAG-Fusion stand out for their unique features.
In this blog post, we’ll examine the strengths, challenges, and applications of these three RAG systems, and discuss how each is defining the future of AI-powered interactions.
GraphRAG: The power of structured knowledge
GraphRAG is renowned for its ability to understand the causal relationships within data.
In contrast to traditional RAG systems that treat information as isolated blocks of text, GraphRAG uses structured knowledge graphs to organize data into entities and the relationships between them. These knowledge graphs help generate responses that factor in how people, places, products, and concepts are interconnected.
GraphRAG’s main strength
Connecting the dots: By understanding the relationships within information, GraphRAG helps produce more accurate and contextual responses.
Consider a healthcare scenario where a user asks about the link between obesity, hypertension, and diabetes. A traditional RAG system treats the three topics as unrelated pieces of information and would describe them separately without explaining how they are related.
Because GraphRAG uses structured knowledge graphs to organize data into entities and map relationships, it helps generate a “cause and effect” explanation of how obesity contributes to hypertension, which in turn increases the risk of diabetes. The result is a more comprehensive answer.
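This “connect the dots” behavior can be sketched with a toy graph traversal. It is a minimal illustration, not how a production GraphRAG works: the graph, entity names, and the “contributes to” relation below are hypothetical, and a real system would query a graph database and hand the retrieved paths to an LLM.

```python
# Toy knowledge graph: each edge represents a hypothetical
# "contributes to" relationship between health conditions.
graph = {
    "obesity": ["hypertension", "type 2 diabetes"],
    "hypertension": ["type 2 diabetes", "heart disease"],
}

def causal_paths(graph, start, end, path=None):
    """Return every directed path from start to end in the graph."""
    path = (path or []) + [start]
    if start == end:
        return [path]
    paths = []
    for nxt in graph.get(start, []):
        paths.extend(causal_paths(graph, nxt, end, path))
    return paths

# A GraphRAG-style system could pass these chains to an LLM so the answer
# explains the cause-and-effect sequence rather than three separate topics.
for p in causal_paths(graph, "obesity", "type 2 diabetes"):
    print(" -> ".join(p))
```

Retrieving the path `obesity -> hypertension -> type 2 diabetes` is what lets the generated answer explain the link, rather than describing each condition in isolation.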
Challenges facing GraphRAG
Reliance on structured data: GraphRAG’s accuracy depends on its ability to access structured knowledge graphs, but many real-world data sources — text documents, multimedia files, emails — are unstructured (lacking a standardized format). When data can’t be linked to the graph, responses may be incomplete and inaccurate.
- Potential solution: Experts can leverage techniques such as text embedding (converting text into numbers) or use large language models (LLMs) and natural language processing (NLP) to link unstructured data to structured graphs.
High maintenance requirements: Creating and maintaining knowledge graphs requires expertise from data scientists and IT teams. The maintenance involved can slow implementation and make it difficult to scale knowledge graphs.
- Potential solution: To avoid having a GraphRAG produce outdated information, teams should schedule regular data refreshes, set up monitoring systems to detect data inaccuracies, and keep humans in the loop to validate updates.
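The embedding-based linking idea from the first potential solution above can be sketched with a toy similarity match. The bag-of-words “embedding” here is a deliberately crude stand-in for a real LLM or NLP encoder, and the entity descriptions are hypothetical.

```python
from collections import Counter
import math

def embed(text):
    """Toy bag-of-words 'embedding'; a real system would use an LLM encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical graph entities with short text descriptions.
entities = {
    "hypertension": "high blood pressure in arteries",
    "diabetes": "chronic disease of high blood sugar",
}

def link_entity(snippet):
    """Attach an unstructured snippet to the most similar graph entity."""
    vec = embed(snippet)
    return max(entities, key=lambda e: cosine(vec, embed(entities[e])))

print(link_entity("patient note: persistently high blood sugar levels"))
```

The principle carries over directly: replace `embed` with a proper embedding model and `entities` with nodes from the knowledge graph, and unstructured documents can be attached to the graph automatically.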
Want to learn more about assessing your data’s maturity? Download our white paper to discover how data assessments can help improve your AI-powered solutions.
Common use cases for GraphRAG
GraphRAG’s contextual understanding makes it an excellent fit for several industries. In healthcare, GraphRAG identifies connections between symptoms, conditions, and treatments, aiding in diagnoses and personalized care. For legal research, GraphRAG examines case law to pinpoint how a legal precedent applies to an attorney’s case. In finance, GraphRAG is used to analyze transactions for fraud detection and regulatory violations.
Speculative RAG: Balancing speed and accuracy
Users expect near-instant responses from chatbots, and Speculative RAG delivers speed without sacrificing accuracy.
Specifically, Speculative RAG uses a small specialist language model (LM) called a “drafter” to generate an immediate answer. Meanwhile, a larger generalist LM called a “verifier” retrieves more detailed information. Once enough evidence is gathered, the verifier compares the drafter’s initial response with the retrieved data and refines the response as needed.
Traditional RAG systems, on the other hand, retrieve information first and then generate responses. This sequential approach is ideal for producing thorough answers, but it is slower and can frustrate users when speed is critical.
According to Google’s analysis of the PubHealth dataset, Speculative RAG achieves a 51% reduction in response time compared to traditional RAG systems.
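A minimal sketch of this draft-in-parallel-with-retrieval flow, assuming stand-in functions for the drafter, retriever, and verifier (a real deployment would call a small specialist LM and a larger generalist LM rather than these placeholders):

```python
from concurrent.futures import ThreadPoolExecutor

def draft_answer(query):
    """Hypothetical small 'drafter' LM: produces a fast first response."""
    return f"[draft] Quick answer to '{query}'"

def retrieve_evidence(query):
    """Hypothetical retrieval step that runs while the draft is produced."""
    return ["knowledge-base passage relevant to: " + query]

def verify(draft, evidence):
    """Hypothetical larger 'verifier' LM: refines the draft using evidence."""
    return f"{draft} (verified against {len(evidence)} passage(s))"

def speculative_answer(query):
    with ThreadPoolExecutor() as pool:
        # Kick off retrieval in the background...
        evidence_future = pool.submit(retrieve_evidence, query)
        # ...while the drafter answers immediately.
        draft = draft_answer(query)
        evidence = evidence_future.result()  # wait for retrieval to finish
    return verify(draft, evidence)

print(speculative_answer("Why is my order late?"))
```

The key design point is the overlap: drafting and retrieval run concurrently, so the user-visible latency is roughly the slower of the two steps rather than their sum.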
Speculative RAG’s main strength
Less latency, more speed and accuracy: By offering responses instantly, Speculative RAG minimizes wait times. In addition, Speculative RAG’s ability to refine initial responses ensures that users get fast answers without compromising accuracy.
In an ecommerce scenario, Speculative RAG could recommend products in real time based on initial data, then update the recommendation as the system retrieves more information about the user’s preferences.
Challenges facing Speculative RAG
Implementation complexity: Running drafter models and verifier models simultaneously — and making sure they stay in sync — can be challenging, especially for high-traffic applications.
- Potential solution: Use lightweight models for the initial speculative response and use cloud infrastructure that scales resources based on demand to avoid over-provisioning during low-traffic periods.
Training overhead: Training a drafter can be a time-consuming extra deployment step that requires GPU resources to ensure the drafter predicts responses quickly and coherently.
- Potential solution: Instead of training a drafter from scratch, organizations can fine-tune an existing smaller LLM that balances speed and accuracy or create a lightweight model that mimics a larger LLM’s responses.
Common use cases for Speculative RAG
Speculative RAG excels at real-time customer support in industries like telecom and ecommerce where customers need quick information (about an internet outage or a late order delivery) while also receiving updates as they become available.
For instance, during high-traffic ecommerce flash sales (Black Friday, Cyber Monday, holidays), Speculative RAG’s drafter model provides instant customer support answers while the verifier model retrieves more detailed information. This keeps shoppers engaged instead of abandoning purchases due to delays.
It’s worth noting that Speculative RAG is designed for scenarios that depend on speed; it is not ideal for industries like healthcare or finance, in which accuracy outweighs the need for instant responses.
RAG-Fusion: An aggregation machine
RAG-Fusion systems take a more advanced path to retrieval and generation than traditional RAG systems by breaking down an original user query into more specific sub-queries that pull from different sources like databases, research reports, and videos.
Then, using a technique called reciprocal rank fusion (RRF), RAG-Fusion assigns scores to the results retrieved for each sub-query and re-ranks them based on relevance to the user’s original query. The re-ranked information is then aggregated — or “fused” — to create a comprehensive and accurate response.
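Reciprocal rank fusion itself is simple to sketch. The document labels and the sub-query result lists below are hypothetical; the `k = 60` smoothing constant is the value commonly used with RRF.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: score(doc) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Documents appearing high in many lists accumulate the largest scores.
    return sorted(scores, key=scores.get, reverse=True)

# Three hypothetical sub-queries each return their own ranked results.
rankings = [
    ["reviews", "inventory", "video"],
    ["inventory", "reviews"],
    ["video", "reviews", "manual"],
]
print(reciprocal_rank_fusion(rankings))
```

Here “reviews” wins the fused ranking because it appears in all three lists, even though it is not ranked first in every one: RRF rewards consistent relevance across sub-queries.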
RAG-Fusion’s main strength
More well-rounded answers, fewer AI hallucinations: Because it generates sub-queries and aggregates data from each, RAG-Fusion develops a deeper understanding of the user’s query, ensuring responses are insightful and well-informed. For example, in ecommerce, RAG-Fusion combines customer reviews, inventory data, and multimedia content to recommend products.
Further, by collecting diverse and verified information, RAG-Fusion minimizes the risk of AI hallucinations (fabricated answers based on patterns AI models learn from training data). This makes RAG-Fusion more reliable in high-stakes environments like healthcare and finance.
Discover essential strategies for preventing AI hallucinations. Download our white paper, “When machines dream: Overcoming the challenges of AI hallucinations” and learn how to build customer trust with reliable AI outputs.
Challenges facing RAG-Fusion
Data integration: Aggregating data for a RAG-Fusion system can be technically challenging due to differences in data quality and data formats, as well as scalability (more data sources mean larger volumes of data).
- Potential solution: Implement monitoring systems that can flag outdated, duplicate, or incomplete data for review before integration.
Clear and consistent responses: When aggregating information from different sources, there’s a risk of generating unclear, inconsistent, or overly complex responses.
- Potential solution: Develop response generation algorithms that filter out conflicting or redundant information. While these algorithms are often pre-built into RAG-Fusion systems, developers also have the option to customize them.
Common use cases for RAG-Fusion
Ecommerce recommendations: RAG-Fusion can be used to personalize product recommendations by merging structured data (inventory), unstructured data (customer reviews), and multimedia (product images or videos).
Healthcare support: Industry professionals can use RAG-Fusion to integrate structured data (patient records) with unstructured text (research papers) and imaging information (X-rays) to provide more comprehensive healthcare support.
The cost factor for RAG systems
The RAG systems discussed in this post offer transformative benefits. However, they come with a higher price tag than traditional RAG systems due to the computational costs of data processing, the resources required to collect data, and, in GraphRAG’s case, the effort of building and maintaining knowledge graphs.
The right RAG system for you depends on your company’s data management skills, industry-specific applications, organizational goals, and budget. Despite the extra expense and effort, however, companies can expect to see an average return of $3.50 for every dollar invested in AI, based on a study by IDC.
RAG is redefining enterprise AI
GraphRAG, Speculative RAG, and RAG-Fusion showcase the versatility of RAG systems. Each system is designed for specific challenges and industries and has significant benefits compared to traditional RAG.
The combined potential of these systems highlights how RAG is redefining AI with smarter, faster, and more accurate interactions.
For its part, HTEC has created its own highly configurable RAG pipeline for its AI system that can scale from basic to advanced RAG. HTEC is excited to pass its knowledge and expertise on to enterprises looking to enhance their AI systems with the latest RAG technology.
Ready to discover how HTEC’s AI and data science expertise can support your business strategy? Connect with an HTEC expert.