So far, the relationship between the Client and Server has been one-way regarding intelligence. The Client (Host) has the “Brain” (the LLM), and the Server has the “Brawn” (Tools/Data).
But sometimes, the Server needs a little bit of intelligence to finish its job.
Consider a Log Archival Server. Its job is to save error logs to a file. However, simply saving raw, cryptic error codes isn’t very useful. You want the server to:
1. Read the raw error.
2. Understand and summarize it into plain English.
3. Save the explained version to a report file.
To do step #2, the server needs an LLM. Instead of buying a separate API key for the server, MCP allows it to use Sampling to “borrow” the Client’s LLM connection.
You will build a tool called process_log_entry. It takes a raw log line, asks the Client to explain it, and then appends that explanation to a local file.
Create a file named sampling_server.py:
from mcp.server.fastmcp import FastMCP, Context
from mcp.types import SamplingMessage, TextContent

mcp = FastMCP("Log-Archiver")

@mcp.tool()
async def process_log_entry(log_line: str, ctx: Context) -> str:
    """
    Analyzes a log entry using the Client's LLM and saves the report to a file.
    """
    prompt = f"You are a Site Reliability Engineer. Explain this log error in one concise sentence: {log_line}"

    # Ask the Client's LLM for an explanation via a sampling request.
    result = await ctx.session.create_message(
        messages=[
            SamplingMessage(
                role="user",
                content=TextContent(type="text", text=prompt),
            )
        ],
        max_tokens=100,
    )

    if result.content.type == "text":
        llm_explanation = result.content.text
    else:
        llm_explanation = str(result.content)

    # Append the raw line and its explanation to the local report file.
    report_file = "incident_report.txt"
    with open(report_file, "a") as f:
        f.write("--- INCIDENT REPORT ---\n")
        f.write(f"RAW: {log_line}\n")
        f.write(f"ANALYSIS: {llm_explanation}\n")
        f.write("-" * 30 + "\n")

    return f"Success: Log analyzed and appended to '{report_file}'."

if __name__ == "__main__":
    mcp.run(transport="streamable-http")
What You Are Building
You are building a server that asks the client to use its LLM to explain a log entry, then saves that explanation to a local report file. The server never talks to the model directly; it requests a sampling response from the client.
Implementing the Sampling Client
The Client acts as the bridge. It connects to the server and listens for sampling requests. When the server asks for help, the client forwards the request to the actual Anthropic API.
Prerequisites:
Ensure you have the package installed: uv add anthropic
Ensure your ANTHROPIC_API_KEY is set in your environment variables.
Create a file named sampling_client.py:
import asyncio
import os

from anthropic import Anthropic
from mcp import ClientSession, types
from mcp.client.session import RequestContext
from mcp.client.streamable_http import streamablehttp_client

SERVER_URL = "http://127.0.0.1:8000/mcp"

llm_client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))

async def sampling_handler(
    context: RequestContext,
    params: types.CreateMessageRequestParams,
) -> types.CreateMessageResult:
    """
    This function triggers when the Server asks for an LLM completion.
    We bridge the request to the real Anthropic API.
    """
    server_prompt = params.messages[0].content.text
    print(f"\n[Client] Server requested LLM generation for: '{server_prompt}'")
    print("[Client] Forwarding request to Claude...")

    message = llm_client.messages.create(
        max_tokens=params.maxTokens or 1024,
        messages=[
            {
                "role": "user",
                "content": server_prompt,
            }
        ],
        model="claude-sonnet-4-5-20250929",
    )

    ai_response = message.content[0].text
    print(f"[Client] Received answer from Claude: '{ai_response}'")

    return types.CreateMessageResult(
        model="claude-sonnet-4-5",
        role="assistant",
        content=types.TextContent(type="text", text=ai_response),
    )

async def run_client():
    print(f"Connecting to Server at {SERVER_URL}...")
    async with streamablehttp_client(SERVER_URL) as (read, write, _):
        async with ClientSession(
            read,
            write,
            sampling_callback=sampling_handler,
        ) as session:
            await session.initialize()
            print("Connected.\n")

            print("--- Test: Archiving Complex Error ---")
            complex_error = (
                "2025-12-21 14:02:11 UTC [821] ERROR: deadlock detected "
                "DETAIL: Process 821 waits for ShareLock on transaction 456; "
                "process 999 waits for ShareLock on transaction 821. "
                "HINT: See server log for query details."
            )

            result = await session.call_tool(
                "process_log_entry",
                arguments={"log_line": complex_error},
            )
            print(f"\nFinal Tool Result:\n{result.content[0].text}")

if __name__ == "__main__":
    asyncio.run(run_client())
How the Server Script Works
ctx.session.create_message(...) triggers a sampling request to the client.
The server waits for the client to return an LLM response.
The server writes the explanation to incident_report.txt.
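If you want to sanity-check the file-writing step without an LLM in the loop, you can reproduce it in isolation. The `append_report` helper and the hard-coded analysis text below are hypothetical stand-ins, not part of the lesson's code; they just mirror the writes inside process_log_entry:

```python
# Sketch of the server's report-appending step, with the LLM explanation
# replaced by a hard-coded stub so it runs standalone.
def append_report(report_file: str, log_line: str, explanation: str) -> None:
    # Same four writes as process_log_entry: header, raw line, analysis, divider.
    with open(report_file, "a") as f:
        f.write("--- INCIDENT REPORT ---\n")
        f.write(f"RAW: {log_line}\n")
        f.write(f"ANALYSIS: {explanation}\n")
        f.write("-" * 30 + "\n")

append_report(
    "incident_report.txt",
    "ERROR: deadlock detected",
    "Two transactions blocked each other, so the database aborted one.",  # stub
)

with open("incident_report.txt") as f:
    print(f.read())
```

In the real tool, the ANALYSIS line is whatever one-sentence explanation the sampled model returns, so its exact wording will vary from run to run.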
How the Client Script Works
sampling_handler(...) is the callback the server triggers for sampling.
It forwards the prompt to Anthropic and returns the response.
sampling_callback=... is what makes the client “sampling-aware.”
The API key is read from ANTHROPIC_API_KEY.
This setup demonstrates why Sampling is so valuable:
Separation: The Server performs a privileged task (writing to a local file, incident_report.txt) that the remote LLM Client cannot do directly.
By combining the Client's intelligence with the Server's local access, you create a system that is smarter and more capable than either component could be alone. In the next step, you will run this and see the report file appear on your disk.
Run It
Install dependencies:
uv add anthropic
Set your API key in the environment:
export ANTHROPIC_API_KEY="YOUR_KEY_HERE"
PowerShell:
$env:ANTHROPIC_API_KEY="YOUR_KEY_HERE"
Start the server in one terminal:
uv run python sampling_server.py
Run the client in a second terminal:
uv run python sampling_client.py
You should see the client forward the sampling request to Anthropic, and the server will append an entry to incident_report.txt.
This content was released on Apr 10 2026. The official support period is six months from this date.
In this lesson, you will explore Sampling, a feature that allows MCP servers to request AI generation from the client. You will learn how to implement a “Brain and Brawn” architecture where the server uses the Client’s LLM for analysis and its own local access for persistence.