Gemini産DeepResearchのオープンソースをAzure環境で再構築！社内文書のディープリサーチを作る手順

2025.07.01

1. はじめに
2. なぜAzure AI Searchなのか
3. 移植における修正ポイント
4. 移植における修正ポイント
5. 検索結果
6. まとめ

1. はじめに

MS開発部の松坂です。今回は前回の記事の続編として、Googleが公開したオープンソース「DeepResearch」をベースに、Azure OpenAI環境へ移植し、社内ドキュメント検索エージェントとして再構成する方法をご紹介します。

Gemini産DeepResearchのオープンソースをAzure OpenAIに置き換える方法 – 株式会社ディープコム GoogleのGitHub上で公開されている google-gemini/gemini-fullstack-langgraph-quickstart は、LangGraphとGemini 2.5を活用した多段構成のフルスタックエージェントのサンプルです。本記事では、この構成の中核となっているWeb検索機能を、Azure AI Searchによるドキュメント検索へと置き換える具体的な方法について解説します。

google-gemini/gemini-fullstack-langgraph-quickstart: Get started with building Fullstack Agents using Gemini 2.5 and LangGraph

注意：本記事で紹介する「DeepResearch」は、GoogleがGitHub上で公開している google-gemini/gemini-fullstack-langgraph-quickstart プロジェクトを基にしています。一方、「DeepResearch」はOpenAIでも同名の機能として存在するため、混同にご注意ください。本記事ではGoogleのGitHubプロジェクトに基づき、Azure OpenAIへ移植する手順を解説しています。

2. なぜAzure AI Searchなのか

Azure AI Search（旧称 Azure Cognitive Search）は、Azure Storage や SQL Database などからデータを取り込み、インデックスを生成することで、全文検索やベクター検索を実現できる検索プラットフォームです。

セキュリティと管理性の向上内部ドキュメントを安全に扱えるため、企業利用に適しています。
既存Azureサービスとの連携 Azure StorageやAzure Functionsなどと連携しやすく、拡張性が高いです。
コスト最適化 Azureの柔軟な料金体系とリソース管理で、運用コストを最適化できます。

3. 移植における修正ポイント

本記事では、Web検索をAzureAISearchと連結したエージェントを構築するまでに必要な修正箇所を要点に絞って解説します。対象：gemini-fullstack-langgraph-quickstart-main/backend/src/agent 配下のPythonコード元の構成では、Google SearchやBingを利用したWeb検索エージェントが組み込まれており、ユーザーの質問に対して外部情報を取得する仕組みとなっています。今回、この検索処理をAzure AI Searchに置き換えることで、社内のOneDrive、SharePoint、ローカル文書データなどに対して、直接的かつ安全なリサーチを可能にします。

4. 移植における修正ポイント

4-1. 事前準備：Azure AI Foundry上に多段エージェントを構築

Azure AI Foundryには、新機能として「接続されたエージェント（Connected Agents）」が登場しています。本構成では、この仕組みを用いて、Azure AI Searchを組み込んだ多段エージェントを事前に構築してください。 Azure AI Foundryに新機能「接続されたエージェント」登場 – 株式会社ディープコム

4-2. プロンプトの作成

次に、Azure AI Searchによる検索のためのプロンプトを用意します。ここでは社内ドキュメントに特化した形で、以下のような構成にしています。

doc_search_instructions ="""Conduct a comprehensive search across organizational OneDrive, SharePoint, or integrated document repositories to retrieve the most recent and relevant documents related to "{research_topic}". The current date is {current_date}.

Instructions:
- Search should prioritize documents updated or created in the last 6–12 months unless otherwise relevant.
- Include Word, PowerPoint, Excel, PDF, and other commonly used formats.
- Extract verifiable facts, figures, strategies, and expert insights from the documents.
- For each piece of information, clearly cite the document name, author (if available), and last modified date.
- Compile the results into a coherent summary or report.
- Do not infer or assume any information that is not explicitly stated in the source documents.
- Avoid duplication and cross-reference where similar data appears in multiple sources.

Research Topic:
{research_topic}
"""

doc_search_instructions ="""Conduct a comprehensive search across organizational OneDrive, SharePoint, or integrated document repositories to retrieve the most recent and relevant documents related to "{research_topic}". The current date is {current_date}.

Instructions:

- Search should prioritize documents updated or created in the last 6–12 months unless otherwise relevant.

- Include Word, PowerPoint, Excel, PDF, and other commonly used formats.

- Extract verifiable facts, figures, strategies, and expert insights from the documents.

- For each piece of information, clearly cite the document name, author (if available), and last modified date.

- Compile the results into a coherent summary or report.

- Do not infer or assume any information that is not explicitly stated in the source documents.

- Avoid duplication and cross-reference where similar data appears in multiple sources.

Research Topic:

{research_topic}

"""

4-3. graph.py の修正

web_researchの関数内で呼び出していたプロンプト部分を以下のように修正します

formatted_prompt = doc_search_instructions.format(
        current_date=get_current_date(),
        research_topic=state["search_query"],
    )

formatted_prompt = doc_search_instructions.format(

current_date=get_current_date(),

research_topic=state["search_query"],

)

エージェントIDや呼び出しロジック自体は前回の構成をそのまま流用可能です。上記のようにプロンプト内容を差し替えることで、Azure AI Search による社内ドキュメント検索エージェントとして動作します。

5. 検索結果

以下記事にございます、データセットを利用してAzureAISearchのインデクサーを作成し、多段構成のAzureAIAgentを用いて検索を行いました。 Azure Cognitive Search 全文検索の新機能セマンティックアンサー・キャプションを表示するデモ #AI – Qiita

実行結果では、ドキュメントに関連する情報が要約されて返ってきました。ただし、Web検索とは異なりURLリンクのような明示的な出典が含まれない点には注意が必要です。この点は、プロンプト設計やLangChain周辺のカスタマイズによって柔軟に調整可能です。

6. まとめ

本記事では、Google GeminiプロジェクトのDeepResearch構成を、Azure AI Searchを用いた社内向けドキュメント検索エージェントとして移植する方法をご紹介しました。外部Web検索に依存せず、社内ナレッジを活用した高度なディープリサーチを行いたい企業や組織にとって、非常に実用的なアプローチとなります。セキュアで管理性の高い環境で、AI活用の幅を広げたい方はぜひお試しください。

以上、最後までご愛読いただき
ありがとうございました。

お問い合わせは、
以下のフォームへご連絡ください。

お問い合わせ