Part 2: Implementing Your First RAG Solution with Progress Agentic RAG

by Adam Bertram Posted on October 16, 2025

How DataVault Connected 20 Years of Research in 48 Hours to Transform Their Knowledge Management 

Having proven the value of RAG technology with their initial knowledge box, DataVault Financial Services faced their next challenge: scaling beyond manually uploaded documents to create a comprehensive intelligence network. Their 20-year archive of research reports, real-time market feeds and compliance documentation remained scattered across systems, limiting the transformative potential they’d glimpsed. 

This is the second article in our three-part series following DataVault’s implementation of Progress’ RAG-as-a-Service platform, Progress Agentic RAG. In our previous article, we saw how they solved their immediate knowledge crisis by creating their first searchable repository. Now we’ll explore how they built sophisticated data pipelines, connected multiple sources and created AI-powered systems that transformed scattered information into actionable intelligence, all while facing a critical compliance deadline.

The Compliance Crisis 

Sarah Rodriguez stared at the email timestamp: 2:47 PM Thursday. A major client wanted a comprehensive market outlook report combining Fed policy analysis, global economic trends and regulatory updates. Due: Monday morning. 

“Our research is scattered everywhere: Dropbox folders, news feeds we never read, regulatory documents in email attachments,” she told David Kim, their senior developer. “Last time this took two weeks. We have 48 hours.” 

The Integration Challenge 

David had spent the morning after their initial Progress Agentic RAG success exploring the platform’s integration capabilities. The InvestmentInsights knowledge box was already proving valuable, but it only contained their most recent 1,000 documents—manually uploaded. 

“Look at this,” David said, pulling up both the Progress Agentic RAG dashboard and his code editor. “They have connectors for folders, RSS feeds, even web scraping. I’ve already written scripts to automate everything.” 

Progress Agentic RAG Synchronize Dashboard 

The Progress Agentic RAG Synchronize interface showing three data source options: Folder for local file systems, RSS for news feeds and Sitemap for web content 

Marcus Chen leaned over his shoulder. “How fast can you connect our historical archives?” 

Connecting the Document Archive 

DataVault’s Dropbox contained 20 years of financial history: SEC filings, Federal Reserve reports, IMF analyses and internal compliance documentation. David knew that Progress Agentic RAG’s Sync Agent could handle this volume effortlessly by monitoring the local Dropbox folder on their workstation. 

Setting Up Data Synchronization 

Within the Synchronize section, David reviewed the available data source options. The interface presented three main integration paths: Folder for local file systems, RSS for news feeds and Sitemap for web content. 

Progress Agentic RAG Synchronize Dashboard 

The Progress Agentic RAG Synchronize interface showing three data source options: Folder, RSS and Sitemap, with the Sync Agent setup panel above 

“To connect our Dropbox archive, we’ll use the Progress Agentic RAG Sync Agent with a Folder connector,” David explained to Sarah. “Since Dropbox syncs files to a local folder on our workstation, we can point the Sync Agent directly to that directory. The agent will monitor the folder for any changes and automatically sync them to Progress Agentic RAG. RSS feeds work independently, so we can set those up right away through the dashboard without the Sync Agent.”
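To make “monitoring a folder for changes” concrete, here is a minimal sketch of snapshot-based change detection. The article does not describe how the Sync Agent works internally, so this is an illustration of the general idea rather than the agent's implementation; the function names `snapshot` and `diff_snapshots` are hypothetical.

```python
from pathlib import Path


def snapshot(root):
    """Map each file under root (as a relative path) to (mtime_ns, size)."""
    root = Path(root)
    state = {}
    for path in root.rglob("*"):
        if path.is_file():
            stat = path.stat()
            state[path.relative_to(root).as_posix()] = (stat.st_mtime_ns, stat.st_size)
    return state


def diff_snapshots(old, new):
    """Return (added, modified, removed) relative paths between two snapshots."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    modified = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, modified, removed
```

Taking a snapshot on a timer and diffing it against the previous one yields the list of files that would need to be uploaded again or removed.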

For the Dropbox integration, David configured a Folder connector: 

Folder Connector Configuration 

The folder configuration interface for setting up synchronized sources 

  1. Download and install the Progress Agentic RAG Sync Agent on his workstation 
  2. Click on “Folder” in the Add sources section 
  3. Name the connector “Dropbox Archive” for easy identification 
  4. Enter the local Dropbox path /Users/david/Dropbox/DataVault_Demo/ 
  5. Configure file filters to include .pdf, .docx, .txt, .md, .html and .json files 
  6. Save and sync to begin the initial synchronization 
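The effect of the file filter in step 5 can be sketched in a few lines of Python. This is only an illustration of extension-based filtering, not the connector's own code; the helper `passes_filter` and the two rejected file names are hypothetical.

```python
from pathlib import PurePosixPath

# Extensions taken from the connector's file filter in step 5.
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".html", ".json"}


def passes_filter(filename, allowed=ALLOWED_EXTENSIONS):
    """Return True if the file's extension (case-insensitive) is allowed."""
    return PurePosixPath(filename).suffix.lower() in allowed


candidates = [
    "Compliance/SEC_Form_ADV_Part2_Template.pdf",
    "Research/Client_Portfolio_Analysis.json",
    "Research/scratch.tmp",       # hypothetical file, not in the article
    "Market_Analysis/.DS_Store",  # hypothetical file, not in the article
]

# Only the .pdf and .json entries survive the filter.
selected = [name for name in candidates if passes_filter(name)]
```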

Progress Agentic RAG Synchronizations dashboard with configured sources 

The Synchronizations dashboard showing active data sources: DataVault Research Archive and Dropbox Archive (both using Folder connectors via Sync Agent), plus RSS feeds (MarketWatch Markets, Federal Reserve News and Reuters Business News), all actively syncing 

Once configured, the Folder connector began processing their folder structure: 

From /Compliance: 

  • SEC_Form_ADV_Part2_Template.pdf - Latest compliance template 
  • SEC_Form_ADV_Instructions.pdf - Regulatory filing guidelines 
  • Risk_Disclosure_Requirements.txt - Updated disclosure requirements 

From /Market_Analysis: 

  • Fed_BeigeBook_July_2024.pdf - Regional economic conditions 
  • Fed_BeigeBook_Sept_2024.pdf - Latest Fed economic report 
  • Q3_2024_Market_Summary.md - Quarterly market analysis 
  • Investment_Committee_Minutes.html - Recent committee decisions 

From /Research: 

  • IMF_Global_Financial_Stability_Oct_2024.pdf - 11MB comprehensive report 
  • Client_Portfolio_Analysis.json - Structured portfolio data 
  • Global_Economic_Outlook.txt - Economic projections and analysis 

Documents appearing in Progress Agentic RAG Resources list 

The Progress Agentic RAG Resources list showing 144 processed documents from various sources including WSJ articles and financial advisory content 

“Look at this,” David showed Sarah as documents began appearing in the Resources list. “It’s preserving our entire folder structure. Your compliance team can find documents using the same mental model they already have.” 

Real-Time Market Intelligence 

While the historical documents synchronized, Lisa Thompson, their senior analyst, had her own request: “Can we get real-time market news flowing into this system? I need to correlate today’s events with our historical analyses.” 

“RSS feeds are even easier,” David explained, pulling up his terminal. “They don’t require the Sync Agent; Progress Agentic RAG can pull from them directly through the cloud. Watch this.” 

David ran his rss_feed_config.py script from the code_samples repository, showing Lisa the configuration in action: 

# DataVault's RSS feed configuration
# File: rss_feed_config.py
rss_feeds = [
    {
        "name": "MarketWatch Markets",
        "url": "https://feeds.content.dowjones.io/public/rss/RSSMarketsMain",
        "category": "market_analysis"
    },
    {
        "name": "Reuters Business News",
        "url": "https://feeds.feedburner.com/reuters/businessNews",
        "category": "market_news"
    },
    {
        "name": "Bloomberg Markets",
        "url": "https://feeds.bloomberg.com/markets/news.rss",
        "category": "market_news"
    }
]
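Before handing a feed list like this to any ingestion step, it can help to sanity-check it. The sketch below is one possible check, assuming the three keys used in the configuration (`name`, `url`, `category`); the function `validate_feeds` is hypothetical and is not part of DataVault's repository or the platform.

```python
from urllib.parse import urlparse


def validate_feeds(feeds):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    seen_names = set()
    for index, feed in enumerate(feeds):
        missing = {"name", "url", "category"} - feed.keys()
        if missing:
            problems.append(f"feed #{index}: missing keys {sorted(missing)}")
            continue
        if feed["name"] in seen_names:
            problems.append(f"feed #{index}: duplicate name {feed['name']!r}")
        seen_names.add(feed["name"])
        parsed = urlparse(feed["url"])
        if parsed.scheme != "https" or not parsed.netloc:
            problems.append(f"feed #{index}: URL must be an https address")
    return problems


# The same three feeds as in the configuration above.
rss_feeds = [
    {"name": "MarketWatch Markets",
     "url": "https://feeds.content.dowjones.io/public/rss/RSSMarketsMain",
     "category": "market_analysis"},
    {"name": "Reuters Business News",
     "url": "https://feeds.feedburner.com/reuters/businessNews",
     "category": "market_news"},
    {"name": "Bloomberg Markets",
     "url": "https://feeds.bloomberg.com/markets/news.rss",
     "category": "market_news"},
]

problems = validate_feeds(rss_feeds)
```

The check only inspects the configuration itself; it does not contact the feed URLs.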

RSS feed configuration interface 

Adding a new RSS feed for MarketWatch Markets with automatic 15-minute sync intervals 

As each feed was added, Progress Agentic RAG began indexing articles in real-time. Lisa watched as breaking news about Federal Reserve policy changes appeared alongside historical Fed analyses in their knowledge base. Within moments of adding the Federal Reserve RSS feed, she could see the latest announcement about “Federal Reserve Board announces final individual capital requirements for large banks” appearing in the Latest processed section. 
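For readers curious what “indexing articles” from a feed starts from, here is a minimal sketch that extracts items from RSS 2.0 text with Python's standard library. It is not how Progress Agentic RAG processes feeds, which the article does not describe. The sample feed is hypothetical: the title is the announcement quoted above, while the link is a placeholder.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; the link is a made-up placeholder.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Federal Reserve News</title>
    <item>
      <title>Federal Reserve Board announces final individual capital requirements for large banks</title>
      <link>https://example.com/press/1</link>
    </item>
  </channel>
</rss>"""


def parse_items(rss_text, category):
    """Extract title/link pairs from RSS 2.0 text and tag them with a category."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "category": category,
        })
    return items


articles = parse_items(SAMPLE_RSS, category="market_news")
```

Each resulting dictionary carries the category from the feed configuration, which is what later allows news to be separated from archived documents.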

Federal Reserve RSS Real-time Indexing 

Real-time indexing in action - Federal Reserve announcements appearing immediately in the knowledge base alongside WSJ market articles 

“This is exactly what we needed,” Lisa said. “Now I can search for ‘interest rate changes’ and get both today’s Fed announcement and our historical analysis of similar moves from 2019.” 

Building the Q&A System 

With data flowing in from multiple sources, David implemented the next crucial component: a natural language Q&A system that could understand context across all their documents. 

Semantic Search Configuration 

David accessed the Search configuration page and enabled Progress Agentic RAG’s semantic search capabilities. “I’ve been working on this all weekend,” he told Sarah, opening his terminal: 

# Install the Progress Agentic RAG SDK (published on PyPI as nuclia) and required dependencies
pip install nuclia python-dotenv pytest

“The dependencies installed perfectly,” David continued. “Now let me show you the search implementation using the Progress Agentic RAG Python SDK.” He opened the search_financial_insights.py script he’d been testing: 

# search_financial_insights.py
import os

from dotenv import load_dotenv
from nuclia import sdk

# Load environment variables
load_dotenv()

# DataVault's InvestmentInsights Knowledge Box configuration
# (https://docs.nuclia.dev/docs/management/knowledgebox)
NUCLIA_API_KEY = os.environ.get('NUCLIA_API_KEY', 'YOUR_API_KEY_HERE')
NUCLIA_ZONE = os.environ.get('NUCLIA_ZONE', 'aws-us-east-2-1')
KB_ID = os.environ.get('NUCLIA_KB_ID', 'investmentinsights')


def search_financial_insights(query, show_details=True):
    """
    Search across all DataVault's financial documents
    with semantic understanding using the Progress Agentic RAG SDK
    """
    # Initialize Progress Agentic RAG authentication
    kb_url = f"https://{NUCLIA_ZONE}.nuclia.cloud/api/v1/kb/{KB_ID}"
    sdk.NucliaAuth().kb(url=kb_url, token=NUCLIA_API_KEY)

    # Create search instance
    search_client = sdk.NucliaSearch()

    # Perform the search using the SDK
    try:
        response = search_client.find(
            query=query,
            filters=None  # Can add filters like ['/icon/application/pdf']
        )
        # Process results from multiple sources
        results = {
            'documents': [],
            'news': [],
            'compliance': []
        }
        # .....
        return results
    except Exception as e:
        print(f"Error performing search: {str(e)}")
        return None


# Test the search with Sarah's compliance query
if __name__ == "__main__":
    # Sarah's audit query
    test_query = "risk disclosure regulatory compliance SEC requirements"
    results = search_financial_insights(test_query)
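The script above elides the step that fills the three result buckets. As a rough idea of what such a step could look like, here is a sketch that sorts hits by where they came from. The hit format (`title`, `path`, `origin`) is an assumption made for this example and is not the SDK's response format; `categorize` is a hypothetical helper.

```python
def categorize(hits):
    """Sort hits into the three buckets used by the results dictionary above."""
    results = {"documents": [], "news": [], "compliance": []}
    for hit in hits:
        if hit.get("origin") == "rss":
            # Anything that arrived through a feed counts as news.
            results["news"].append(hit["title"])
        elif hit.get("path", "").startswith("/Compliance/"):
            # Files synced from the /Compliance folder.
            results["compliance"].append(hit["title"])
        else:
            results["documents"].append(hit["title"])
    return results


# Hypothetical hits, using file names and a headline from the article.
sample_hits = [
    {"title": "Risk_Disclosure_Requirements.txt",
     "path": "/Compliance/Risk_Disclosure_Requirements.txt", "origin": "folder"},
    {"title": "Q3_2024_Market_Summary.md",
     "path": "/Market_Analysis/Q3_2024_Market_Summary.md", "origin": "folder"},
    {"title": "New SEC Disclosure Rules Take Effect", "origin": "rss"},
]

categorized = categorize(sample_hits)
```

Keying the buckets on folder and origin relies on the folder structure being preserved during synchronization, as described earlier.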

David ran his test suite to validate the implementation before Sarah’s critical demo: 

# Running the test suite
python -m pytest test_nuclia_search.py -v

Pytest validation of Progress Agentic RAG search API 

All 10 tests passing - API connectivity, search endpoints, filters and specific compliance term searches all validated 

“Perfect! All tests passing,” David said. “The system is ready.” 
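The article does not show the contents of the test file, so the following is only a sketch of how a search call could be tested without network access. It assumes a client object with a `find(query=..., filters=...)` method, mirroring the call in the search script; `run_search` and `StubSearchClient` are hypothetical names introduced for this example.

```python
def run_search(client, query, filters=None):
    """Call client.find and return its response, or None on failure."""
    try:
        return client.find(query=query, filters=filters)
    except Exception as error:
        print(f"Error performing search: {error}")
        return None


class StubSearchClient:
    """Stand-in for the real search client; records calls instead of using the network."""

    def __init__(self, response=None, error=None):
        self.response = response
        self.error = error
        self.calls = []

    def find(self, query, filters=None):
        self.calls.append((query, filters))
        if self.error is not None:
            raise self.error
        return self.response


def test_query_and_filters_are_forwarded():
    client = StubSearchClient(response={"hits": 3})
    assert run_search(client, "SEC requirements", ["/icon/application/pdf"]) == {"hits": 3}
    assert client.calls == [("SEC requirements", ["/icon/application/pdf"])]


def test_errors_return_none():
    client = StubSearchClient(error=RuntimeError("connection refused"))
    assert run_search(client, "risk disclosure") is None
```

Passing the client in as a parameter is the design choice that makes this possible: pytest can collect the two `test_` functions and run them in milliseconds, while connectivity checks against the live API stay in a separate, slower suite.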

The Audit Test 

With just 24 hours remaining before the audit, Sarah was nervous. “Show me how this works with a real compliance query,” she said. 

“I built a specific script just for your audit needs,” David replied, opening his compliance_audit_query.py file. “Try this: Show me all risk disclosure documentation, recent regulatory updates and any market analyses that mention systemic risk from the past quarter.” 

David executed the script he’d prepared for exactly this scenario: 

# Sarah's urgent compliance audit query
# File: compliance_audit_query.py
from search_financial_insights import search_financial_insights

audit_query = "risk disclosure regulatory updates systemic risk market analysis"
print(f"Executing search: {audit_query}\n")
results = search_financial_insights(audit_query)
# The function returned categorized results within seconds

The terminal output showed results appearing from multiple sources: 

🔍 Query: 'risk disclosure regulatory updates systemic risk market analysis' 
📊 Total results: 23 
 
📋 Compliance Documents: 
  • Risk_Disclosure_Requirements.txt 
  • SEC_Form_ADV_Instructions.pdf 
  • Regulatory_Compliance_Checklist_2024.docx 
 
📰 Recent News: 
  • Fed Warns of Systemic Risk in Commercial Real Estate 
  • New SEC Disclosure Rules Take Effect 
  • Banking Regulators Update Risk Management Guidelines 
 
📄 Research Documents: 
  • IMF_Global_Financial_Stability_Oct_2024.pdf 
  • Q3_2024_Market_Summary.md 
  • Systemic_Risk_Assessment_Framework.pdf 
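A report in this shape can be produced from a categorized results dictionary with a small formatting function. The sketch below is an assumption about how that might be done, not DataVault's code; `format_report` is a hypothetical name, and the total it prints is simply the number of titles it lists, whereas the output above reports a larger total than it displays.

```python
def format_report(query, results):
    """Render categorized results as a plain-text report."""
    sections = [
        ("📋 Compliance Documents:", results.get("compliance", [])),
        ("📰 Recent News:", results.get("news", [])),
        ("📄 Research Documents:", results.get("documents", [])),
    ]
    total = sum(len(titles) for _, titles in sections)
    lines = [f"🔍 Query: '{query}'", f"📊 Total results: {total}"]
    for heading, titles in sections:
        lines.append("")  # blank line between sections
        lines.append(heading)
        lines.extend(f"  • {title}" for title in titles)
    return "\n".join(lines)


report = format_report(
    "risk disclosure regulatory updates systemic risk market analysis",
    {
        "compliance": ["Risk_Disclosure_Requirements.txt"],
        "news": ["New SEC Disclosure Rules Take Effect"],
        "documents": ["Q3_2024_Market_Summary.md"],
    },
)
print(report)
```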

Sarah’s eyes widened as she watched the terminal output. “This would have taken me three days to compile manually. And look – it’s showing me connections I hadn’t even considered.” 

David smiled. “That’s because I configured the semantic search to understand financial synonyms and relationships. Let me show you the configuration.” He opened search_config.py to demonstrate the optimizations. 

Inflation and market volatility correlation search 

Progress Agentic RAG revealing connections between inflation uncertainty, market volatility and regulatory requirements across Risk Disclosure documents, Q3 Market Summary and IMF Global Financial Stability reports 

Implementing Client Intelligence 

As word spread about the system’s capabilities, other departments wanted in. The wealth management team needed a way to quickly answer client questions about market conditions. 

David explored Progress Agentic RAG’s widget functionality, which allows embedding search and chat interfaces directly into websites. “We can create a client-safe search interface using Progress Agentic RAG’s pre-built widgets,” he explained. “They handle all the complexity while keeping our data secure.” The wealth management team could now provide instant insights to their clients: 

Working client portal with Federal Reserve search results 

DataVault’s custom client portal showing real-time search results for “Federal Reserve interest rates” – pulling from FOMC statements, IMF reports, Beige Books and regulatory announcements indexed in their InvestmentInsights knowledge box 

The portal demonstrated immediate value: 

  • Real-time results: Queries returned instantly from their 153 indexed documents 
  • Source transparency: Each result showed its origin for compliance tracking 
  • Semantic understanding: Natural language queries found relevant content regardless of exact phrasing 
  • Quick search suggestions: Pre-configured searches for common client questions 

The Transformation Moment 

The audit arrived Monday morning. Sarah had David’s scripts loaded on her laptop, ready to demonstrate. She pulled up the terminal alongside the Progress Agentic RAG dashboard and began her presentation to the regulators: 

“Let me show you how we’ve transformed our compliance documentation system,” she began. “David, run the Basel III query.” 

David typed into his terminal, executing a modified version of the compliance script: 

python compliance_audit_query.py --query "Basel III implementation status DataVault 2024" 

Instantly, results populated: 

  • Internal Basel III implementation roadmap 
  • Recent Federal Reserve Basel III guidance (from RSS feed) 
  • Historical compliance assessments 
  • Related risk management documents 
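The `--query` flag in the command above implies that the script accepts its search text from the command line. The article does not show that part of the script, so this is a minimal sketch of one way to do it with the standard `argparse` module; `parse_args` and `DEFAULT_QUERY` are hypothetical names.

```python
import argparse

# Falls back to the audit query used earlier when no --query is given.
DEFAULT_QUERY = "risk disclosure regulatory updates systemic risk market analysis"


def parse_args(argv=None):
    """Parse command-line options; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(description="Run a compliance audit search.")
    parser.add_argument("--query", default=DEFAULT_QUERY,
                        help="Natural language search query.")
    return parser.parse_args(argv)


# Passing the arguments explicitly keeps the example independent of sys.argv.
args = parse_args(["--query", "Basel III implementation status DataVault 2024"])
print(f"Executing search: {args.query}")
```

In the real script, `args.query` would then be passed to the search function in place of the hard-coded audit query.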

The lead auditor leaned forward. “How did you connect all these systems so quickly?” 

Sarah smiled, gesturing to David. “My developer built a unified intelligence network using RAG-as-a-Service. Show them the code repository, David.” 

David opened his file explorer, showing the organized structure of Python scripts, each one tested and documented. “Every document, every news feed, every analysis – it’s all connected through these scripts and searchable in natural language. The entire implementation took less than a week and the code is maintainable by our whole team.” 

Technical Insights 

For those implementing similar systems, here are key technical considerations DataVault discovered: 

Data Source Priority 

Not all sources are equal. DataVault structured their ingestion priority: 

  1. Compliance documents: Updated daily 
  2. Market news: Real-time RSS feeds 
  3. Historical analyses: Initial bulk upload, then weekly updates 
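One way to express this priority list in code is a small table of sources with refresh intervals, plus a function that reports which sources are due. This is a sketch under stated assumptions, not DataVault's scheduler: the source names, `SOURCES` and `due_sources` are hypothetical, and the 15-minute interval for news is borrowed from the RSS sync interval mentioned earlier.

```python
from datetime import datetime, timedelta

# (priority, source name, refresh interval), following the list above.
SOURCES = [
    (1, "compliance_documents", timedelta(days=1)),
    (2, "market_news", timedelta(minutes=15)),
    (3, "historical_analyses", timedelta(weeks=1)),
]


def due_sources(last_synced, now):
    """Return names of sources whose interval has elapsed, highest priority first."""
    due = []
    for priority, name, interval in sorted(SOURCES):
        previous = last_synced.get(name)
        # A source that has never been synced is always due.
        if previous is None or now - previous >= interval:
            due.append(name)
    return due


now = datetime(2024, 10, 7, 9, 0)
last_synced = {
    "compliance_documents": now - timedelta(hours=30),  # older than one day
    "market_news": now - timedelta(minutes=5),          # synced recently
    "historical_analyses": now - timedelta(days=2),     # within the week
}
pending = due_sources(last_synced, now)
```

With these sample timestamps only the compliance documents are due, and on a first run with no history every source is returned in priority order.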

Search Strategy Optimization 

David had documented their optimal search configuration in a dedicated script: 

# Search configuration for financial data
# File: search_config.py
search_config = {
    'semantic_weight': 0.7,   # Understand intent
    'keyword_weight': 0.3,    # Catch specific terms
    'enable_synonyms': True,  # "Fed" = "Federal Reserve"
    'boost_recent': True,     # Prioritize recent news
    'min_confidence': 0.75    # High accuracy requirement
}
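To show how weights and a confidence threshold like these could interact, here is a sketch of a weighted hybrid score. It is an illustration only: the article does not say how Progress Agentic RAG combines semantic and keyword relevance, and the functions `combined_score` and `filter_hits`, the hit format and the sample scores are all hypothetical.

```python
search_config = {
    "semantic_weight": 0.7,
    "keyword_weight": 0.3,
    "min_confidence": 0.75,
}


def combined_score(semantic, keyword, config=search_config):
    """Weighted sum of two scores that are each assumed to lie in [0, 1]."""
    return config["semantic_weight"] * semantic + config["keyword_weight"] * keyword


def filter_hits(hits, config=search_config):
    """Keep hits at or above min_confidence, best first."""
    scored = [(combined_score(h["semantic"], h["keyword"], config), h["title"])
              for h in hits]
    kept = [(score, title) for score, title in scored
            if score >= config["min_confidence"]]
    return [title for score, title in sorted(kept, reverse=True)]


# Hypothetical scores for three documents named in the article.
hits = [
    {"title": "Risk_Disclosure_Requirements.txt", "semantic": 0.9, "keyword": 0.8},
    {"title": "Global_Economic_Outlook.txt", "semantic": 0.8, "keyword": 0.5},
    {"title": "SEC_Form_ADV_Instructions.pdf", "semantic": 0.7, "keyword": 1.0},
]
ranked = filter_hits(hits)
```

With a 0.7/0.3 split, a document with strong semantic relevance but weak keyword overlap can still fall below the 0.75 threshold, which is the trade-off a high `min_confidence` makes in exchange for precision.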

Performance Tuning 

  • Chunking strategy: 512 tokens for regulatory documents, 1024 for research reports 
  • Metadata extraction: Automatic date, source and category tagging 
  • Update frequency: RSS feeds every 15 minutes, documents every hour 
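The chunk sizes above can be illustrated with a simple fixed-size splitter. This sketch uses whitespace-separated words as a stand-in for tokens, which is an assumption: the platform's tokenizer and chunking logic are not described in the article, and `CHUNK_SIZES` and `chunk_tokens` are hypothetical names.

```python
# Chunk sizes from the list above, keyed by a hypothetical document type label.
CHUNK_SIZES = {"regulatory": 512, "research": 1024}


def chunk_tokens(text, doc_type):
    """Split text into lists of at most CHUNK_SIZES[doc_type] whitespace tokens."""
    size = CHUNK_SIZES[doc_type]
    tokens = text.split()
    return [tokens[start:start + size] for start in range(0, len(tokens), size)]


# A 1,300-word document yields three regulatory chunks or two research chunks.
sample_text = " ".join(["word"] * 1300)
regulatory_chunks = chunk_tokens(sample_text, "regulatory")
research_chunks = chunk_tokens(sample_text, "research")
```

Smaller chunks keep individual regulatory clauses separately retrievable, while larger chunks preserve more of the surrounding argument in long research reports.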

Looking Ahead 

As David closed his laptop at the end of the week, Marcus Chen, the CTO, stopped by his desk. 

“The board is impressed,” Marcus said. “They want to know if we can scale this globally. Our European acquisition needs the same system, but with multilingual support and stricter access controls.” 

David pulled up the Progress Agentic RAG pricing page showing enterprise features. “Unlimited file sizes, cloud or on-premises deployment, custom AI tasks and enterprise support with private Slack channels. We can scale this globally.” 

Progress Agentic RAG Enterprise features pricing page 

Progress Agentic RAG’s Enterprise tier showing unlimited file sizes, on-premises deployment options and advanced AI capabilities 

Sarah, still glowing from the successful audit, added: “And if we can implement AI agents for automated report generation…” 

Marcus nodded. “That’s the next phase. But first, let’s document what we’ve built. This is going to transform how financial services handle information.” 

Try It Yourself: Access the Demo Files 

Want to explore the exact documents and code that powered DataVault’s transformation? All the files used in this article are available on GitHub. 

Key Takeaways 

DataVault’s implementation demonstrates three critical success factors for building a financial intelligence network: 

  1. Start with Clear Integration Priorities: Connect your most critical data sources first (compliance documents) before adding nice-to-haves. 
  2. Leverage Semantic Search Early: Don’t wait to implement natural language search – it’s what drives adoption and reveals hidden insights. 
  3. Design for Scale from Day One: Even if starting small, configure your system to handle growth in users, data sources and complexity. 

In our next article, we’ll explore how DataVault scaled their implementation globally, added multilingual capabilities and built AI agents that generate automated intelligence reports. The transformation from information repository to active intelligence platform was about to accelerate dramatically. 

Ready to build your own financial intelligence network? Start your free Progress Agentic RAG trial and follow DataVault’s proven implementation path. 

Editor's note: We'd like to thank Adam for this comprehensive guide on our newly launched RAG-as-a-Service product. Progress Agentic RAG is just at the beginning of its human-centric AI and innovation journey.  

And as with all things AI, this product will change and evolve. We will be adding new models, features, functions and extending its capabilities. As such, elements in this How-To series might change.  

If you spot areas that have been missed by this guide or if something is not factually correct, reach out to us, and we will fix it ASAP.  

With so much innovation coming, mistakes can happen. Contact us if you spot anything or if you have a suggestion of what you'd like to see next. 


Adam Bertram

Adam Bertram is a 25+ year IT veteran and an experienced online business professional. He’s a successful blogger, consultant, 6x Microsoft MVP, trainer, published author and freelance writer for dozens of publications. For how-to tech tutorials, catch up with Adam at adamtheautomator.com, connect on LinkedIn or follow him on X at @adbertram.
