What is SIMOSphere AI?

SIMOSphere AI is a European AI orchestration platform that connects CRM, ERP, and documents with large language models. It is GDPR-compliant, EU-AI-Act-ready, and Made in Germany.

Is SIMOSphere AI GDPR-compliant?

Yes. All data is stored in Germany (Hetzner), PII is automatically redacted before external model calls, and per-tenant encryption keys ensure data isolation. SIMOSphere AI is compliant with GDPR Art. 5, 6, 13, 17, 25, 28, 30, 32, and 35.

Which AI models does SIMOSphere AI support?

SIMOSphere AI supports Mistral (EU-hosted), OpenAI (GPT-4o, GPT-4o-mini), Anthropic (Claude), and self-hosted models. The platform routes requests to the optimal model for each task through a single, OpenAI-compatible API.

Can SIMOSphere AI be deployed on-premise?

Yes. The Enterprise plan includes an on-premise deployment option with dedicated tenant isolation, custom SSO integration (SAML, OIDC), and a 99.9% uptime SLA.

Is there a free trial?

Yes. All plans include a 14-day free trial with no credit card required. Register at onboarding.simosphereai.com to get started.

What is the Model Context Protocol (MCP) server?

The MCP server connects SIMOSphere AI to your existing business systems. It provides four built-in tools: search_documents (SharePoint, DMS), query_database (D365, CRM), create_record, and send_notification (email, Teams, Slack). The MCP server is available from the Professional plan upward.

Recipes

Code Examples

Production-ready code snippets for common integration patterns. Every example uses the SIMOSphere AI gateway at api.simosphereai.com and works with any OpenAI-compatible client. Copy, paste, and adapt these to get started quickly.

Basic Chat Completion

The simplest integration: send a prompt and receive a completion. The API follows the OpenAI chat completions format with a messages array containing role and content pairs.

curl

curl -X POST https://api.simosphereai.com/v1/chat/completions \
  -H "Authorization: Bearer $SIMOSPHERE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-small-latest",
    "messages": [
      {"role": "system", "content": "You are a GDPR compliance advisor."},
      {"role": "user", "content": "What are the key requirements of Article 30?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'

Node.js

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.SIMOSPHERE_API_KEY,
  baseURL: "https://api.simosphereai.com/v1",
});

const completion = await client.chat.completions.create({
  model: "mistral-small-latest",
  messages: [
    { role: "system", content: "You are a GDPR compliance advisor." },
    { role: "user", content: "What are the key requirements of Article 30?" },
  ],
  temperature: 0.7,
  max_tokens: 1024,
});

console.log(completion.choices[0].message.content);
// usage is optional in the SDK's response type, so guard the access.
console.log(`Tokens: ${completion.usage?.total_tokens ?? "n/a"}`);

Python

import os

from openai import OpenAI

client = OpenAI(
    # Read the key from the environment instead of hard-coding it.
    api_key=os.environ["SIMOSPHERE_API_KEY"],
    base_url="https://api.simosphereai.com/v1",
)

response = client.chat.completions.create(
    model="mistral-small-latest",
    messages=[
        {"role": "system", "content": "You are a GDPR compliance advisor."},
        {"role": "user", "content": "What are the key requirements of Article 30?"},
    ],
    temperature=0.7,
    max_tokens=1024,
)

print(response.choices[0].message.content)
print(f"Tokens: {response.usage.total_tokens}")
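If you prefer not to depend on an SDK, the same request can be assembled with the Python standard library. The following is a minimal sketch, assuming the endpoint and headers shown in the curl example above; the helper names build_chat_request and send are illustrative and not part of any SIMOSphere AI client.

```python
import json
import urllib.request

API_URL = "https://api.simosphereai.com/v1/chat/completions"


def build_chat_request(api_key, messages, model="mistral-small-latest", **params):
    """Build (but do not send) a POST request in the chat completions format."""
    payload = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the reply text (needs network and a valid key)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

To use it, pass your key and a messages list to build_chat_request, then hand the result to send. Extra keyword arguments such as temperature or max_tokens are copied into the JSON body unchanged.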

Streaming Responses

Streaming delivers tokens as they are generated, reducing time-to-first-token and enabling real-time display in chat interfaces. Set stream: true to receive Server-Sent Events (SSE) instead of waiting for the complete response.

curl (SSE)

curl -N -X POST https://api.simosphereai.com/v1/chat/completions \
  -H "Authorization: Bearer $SIMOSPHERE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-small-latest",
    "messages": [{"role": "user", "content": "Write a poem about data sovereignty."}],
    "stream": true
  }'

Node.js (Async Iterator)

const stream = await client.chat.completions.create({
  model: "mistral-small-latest",
  messages: [{ role: "user", content: "Write a poem about data sovereignty." }],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log(); // trailing newline

Python (Generator)

stream = client.chat.completions.create(
    model="mistral-small-latest",
    messages=[{"role": "user", "content": "Write a poem about data sovereignty."}],
    stream=True,
)

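for chunk in stream:
    # A chunk may arrive without choices, so guard the access as the Node.js example does.
    content = chunk.choices[0].delta.content if chunk.choices else None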
    if content:
        print(content, end="", flush=True)
print()  # trailing newline
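The SDK clients above hide the wire format. The sketch below shows roughly what they do with the raw Server-Sent Events that the curl example prints. It assumes the gateway follows the OpenAI streaming convention of `data: {json}` lines terminated by `data: [DONE]`; the function name iter_sse_content and the sample lines are illustrative.

```python
import json


def iter_sse_content(lines):
    """Yield content deltas from an iterable of decoded SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # chunk without choices carries no text
        content = choices[0].get("delta", {}).get("content")
        if content:
            yield content


# Hand-written sample in the assumed format, not captured from the API.
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Data "}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"stays home."}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_sse_content(sample)))  # prints "Data stays home."
```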

Function Calling (Tool Use)

Function calling allows the model to request structured actions from your application. Define tools as JSON Schema and the model will return a tool call with the appropriate arguments when it determines a function should be invoked.

Function Calling

const completion = await client.chat.completions.create({
  model: "mistral-small-latest",
  messages: [
    { role: "user", content: "What is the current weather in Aschaffenburg?" },
  ],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get the current weather for a city.",
        parameters: {
          type: "object",
          properties: {
            city: { type: "string", description: "City name" },
            units: {
              type: "string",
              enum: ["celsius", "fahrenheit"],
              description: "Temperature units",
            },
          },
          required: ["city"],
        },
      },
    },
  ],
  tool_choice: "auto",
});

const toolCall = completion.choices[0].message.tool_calls?.[0];
if (toolCall) {
  console.log("Function:", toolCall.function.name);
  console.log("Arguments:", toolCall.function.arguments);
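  // Example output (the exact arguments depend on the model):
  // Function: get_weather
  // Arguments: {"city":"Aschaffenburg","units":"celsius"}

  // Execute the function and send the result back.
  // getWeather is your own implementation; the SDK does not provide it.
  const weatherResult = await getWeather(JSON.parse(toolCall.function.arguments));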

  const followUp = await client.chat.completions.create({
    model: "mistral-small-latest",
    messages: [
      { role: "user", content: "What is the current weather in Aschaffenburg?" },
      completion.choices[0].message,
      {
        role: "tool",
        tool_call_id: toolCall.id,
        content: JSON.stringify(weatherResult),
      },
    ],
  });

  console.log(followUp.choices[0].message.content);
}
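When an application exposes more than one tool, a small dispatch table keeps the "execute the function" step manageable. The following is a sketch only: it works on plain dicts shaped like the tool call JSON shown above, get_weather is a hypothetical stand-in that returns a fixed value, and TOOLS and run_tool_call are illustrative names.

```python
import json


def get_weather(city, units="celsius"):
    """Hypothetical stand-in for a real weather lookup."""
    return {"city": city, "units": units, "temperature": 21}


# Maps the tool names declared in the request to local callables.
TOOLS = {"get_weather": get_weather}


def run_tool_call(tool_call):
    """Execute one tool call dict and return the matching `tool` message."""
    name = tool_call["function"]["name"]
    handler = TOOLS.get(name)
    if handler is None:
        result = {"error": f"unknown tool: {name}"}
    else:
        try:
            # The model returns arguments as a JSON string, which may be malformed.
            arguments = json.loads(tool_call["function"]["arguments"])
            result = handler(**arguments)
        except (json.JSONDecodeError, TypeError) as exc:
            result = {"error": str(exc)}
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }


call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Aschaffenburg"}'},
}
print(run_tool_call(call)["content"])
```

Returning errors as tool content, rather than raising, lets the model see what went wrong and either correct its arguments or explain the failure to the user.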

Multi-Turn Conversation

Build conversational applications by maintaining a message history. Each request includes the full conversation context so the model can generate contextually relevant responses. Manage the conversation length to stay within the model's context window.

Multi-Turn Conversation

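// Typing the array explicitly lets it hold user, assistant, and tool messages.
const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [
  { role: "system", content: "You are a helpful EU regulation expert." },
];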

async function chat(userMessage: string) {
  messages.push({ role: "user", content: userMessage });

  const completion = await client.chat.completions.create({
    model: "mistral-small-latest",
    messages,
    temperature: 0.7,
  });

  const reply = completion.choices[0].message;
  messages.push(reply);

  return reply.content;
}

// Usage:
await chat("What is the EU AI Act?");
await chat("Which risk categories does it define?");
await chat("How does it affect open-source models?");
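The history in the example above grows without bound, so long sessions eventually exceed the context window. One common way to manage this is to keep the system prompt and drop the oldest turns. The sketch below assumes a rough heuristic of about four characters per token, which is not the model's real tokenizer; estimate_tokens and trim_history are illustrative names.

```python
def estimate_tokens(message):
    """Very rough estimate: about four characters per token, plus overhead."""
    return len(message.get("content") or "") // 4 + 4


def trim_history(messages, max_tokens):
    """Keep all system messages, then as many of the newest messages as fit."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    budget = max_tokens - sum(estimate_tokens(m) for m in system)
    kept = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return system + kept[::-1]  # restore chronological order


history = [
    {"role": "system", "content": "You are a helpful EU regulation expert."},
    {"role": "user", "content": "What is the EU AI Act?"},
    {"role": "assistant", "content": "It is an EU regulation on AI systems."},
    {"role": "user", "content": "Which risk categories does it define?"},
]
for message in trim_history(history, max_tokens=40):
    print(message["role"], "-", message["content"])
```

Call trim_history on the list just before each request. If the conversation uses tool calls, take care not to separate a tool message from the assistant message that requested it; this sketch does not handle that case.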

List and Select Models

Query the models endpoint to discover which models are available on your plan. The response includes model IDs, context windows, and capabilities, which you can use to select the best model for each task dynamically.

curl

curl https://api.simosphereai.com/v1/models \
  -H "Authorization: Bearer $SIMOSPHERE_API_KEY"

Node.js

const models = await client.models.list();

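for (const model of models.data) {
  // context_window is not declared on the SDK's model type, so widen the type before reading it.
  const { context_window } = model as typeof model & { context_window?: number };
  console.log(`${model.id} (context: ${context_window ?? "N/A"})`);
}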

Python

models = client.models.list()

for model in models.data:
    print(f"{model.id} — created: {model.created}")
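Dynamic selection can be as simple as filtering the listed models by what a task needs. The sketch below works on plain dicts and assumes each entry carries a `context_window` number and a `capabilities` list. Those field names are assumptions based on the description above, and the catalog entries are made-up placeholders rather than real models.

```python
def pick_model(models, min_context=0, required=()):
    """Return the ID of the smallest-context model that meets the constraints, or None."""
    candidates = [
        m for m in models
        if (m.get("context_window") or 0) >= min_context
        and set(required) <= set(m.get("capabilities") or [])
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m.get("context_window") or 0)["id"]


# Placeholder catalog for illustration only.
catalog = [
    {"id": "model-a", "context_window": 32000, "capabilities": ["chat"]},
    {"id": "model-b", "context_window": 128000, "capabilities": ["chat", "tools"]},
    {"id": "model-c"},  # entry without the assumed fields
]

print(pick_model(catalog, min_context=16000))   # prints "model-a"
print(pick_model(catalog, required=["tools"]))  # prints "model-b"
```

Preferring the smallest model that qualifies is one possible policy; you might instead rank by price or latency if the response exposes them.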

Error Handling

The API returns standard HTTP status codes. Implement proper error handling with retries for transient failures. The Retry-After header is present on 429 responses and indicates how many seconds to wait before retrying.

Error Handling with Retry

import OpenAI from "openai";

async function robustCompletion(prompt: string, retries = 3) {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await client.chat.completions.create({
        model: "mistral-small-latest",
        messages: [{ role: "user", content: prompt }],
      });
    } catch (error) {
      if (error instanceof OpenAI.RateLimitError && attempt < retries) {
        const retryAfter = Number(error.headers?.["retry-after"]) || 5;
        console.warn(`Rate limited. Retrying in ${retryAfter}s...`);
        await new Promise((r) => setTimeout(r, retryAfter * 1000));
        continue;
      }
      // status can be undefined (for example on connection errors), so default it before comparing.
      if (error instanceof OpenAI.APIError && (error.status ?? 0) >= 500 && attempt < retries) {
        const backoff = 2 ** attempt * 1000;
        console.warn(`Server error. Retrying in ${backoff}ms...`);
        await new Promise((r) => setTimeout(r, backoff));
        continue;
      }
      throw error;
    }
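  }
  // Only reached when retries < 1, because the final attempt always returns or throws.
  throw new Error("retries must be at least 1");
}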
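The same policy can be written in Python. The sketch below separates the decision (how long to wait, or whether to give up) from the retry loop so the decision can be tested without network access. ApiError is a hypothetical stand-in for your client's exception type; with the official Python SDK you would likely catch its rate-limit and status error classes instead.

```python
import random
import time


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status and response headers."""

    def __init__(self, status, headers=None):
        super().__init__(f"HTTP {status}")
        self.status = status
        self.headers = headers or {}


def retry_delay(error, attempt):
    """Return the seconds to wait before the next attempt, or None to give up."""
    if error.status == 429:
        try:
            return float(error.headers.get("retry-after", 5))
        except ValueError:
            return 5.0  # header was not a plain number of seconds
    if error.status >= 500:
        return 2 ** attempt + random.random()  # exponential backoff with jitter
    return None  # other client errors are not retried


def call_with_retry(call, retries=3, sleep=time.sleep):
    """Run call(), retrying on 429 and 5xx errors up to `retries` attempts."""
    for attempt in range(1, retries + 1):
        try:
            return call()
        except ApiError as error:
            delay = retry_delay(error, attempt)
            if delay is None or attempt == retries:
                raise
            sleep(delay)
```

Passing sleep as a parameter is a small design choice that lets tests substitute a recorder for the real delay. The random jitter spreads out retries from many clients that failed at the same moment.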

HTTP Status Codes

Common status codes returned by the SIMOSphere AI API:

Code	Meaning	Action
`200`	Success	Request completed successfully.
`400`	Bad Request	Check request format and parameters. Do not retry.
`401`	Unauthorized	Invalid or missing API key. Verify your credentials.
`403`	Forbidden	Insufficient permissions. Check the key's scopes.
`429`	Rate Limited	Wait for the Retry-After duration, then retry.
`500`	Server Error	Retry with exponential backoff. Contact support if persistent.
`503`	Service Unavailable	Model backend is temporarily unavailable. Retry shortly.
