Bring Your Own LLM Guide


This page provides recommendations for configuring your LLM and outlines the cost calculation methodology for the AskAI feature. For general information on AskAI, please refer to the dedicated documentation.

Bring your own LLM

Vectice supports integration with any LLM of your choice, as long as it meets the minimum performance requirements outlined below. Whether you use commercial APIs or host models privately, Vectice can connect seamlessly. Our team is available to assist with advanced customization needs.

Note: Vectice does not require a multimodal LLM. All current features operate using text-only models.
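
If you serve a model privately, most serving stacks (e.g., vLLM, TGI, Ollama) expose an OpenAI-compatible endpoint. The sketch below is a minimal text-only smoke test for such an endpoint before connecting it to Vectice; the base URL, API key, and model name are placeholders for your own deployment, not values Vectice prescribes.

```python
# Minimal smoke test for a privately hosted, OpenAI-compatible LLM endpoint.
# Placeholders: BASE_URL, API_KEY, and MODEL are hypothetical values --
# substitute the ones from your own serving stack.
from openai import OpenAI

BASE_URL = "https://llm.internal.example.com/v1"
API_KEY = "<your-endpoint-credential>"
MODEL = "llama-3.3-70b-instruct"

client = OpenAI(base_url=BASE_URL, api_key=API_KEY)

# Vectice's features are text-only, so a plain chat completion is enough
# to confirm the endpoint is reachable and the model responds.
response = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "Summarize: gradient boosting."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```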

Benefits of Using Your Own LLM

Bringing your own LLM allows you to:

  • Preserve data confidentiality by processing prompts in a secure, private environment

  • Control costs by selecting the model and infrastructure that fit your usage profile

  • Meet compliance and IT policies with regional hosting or on-premise setups

  • Fine-tune behavior using domain-specific data and enterprise instructions

  • Ensure long-term flexibility across proprietary and open-source model providers

Best Practices for Your LLM Deployment

To ensure a smooth experience and prevent service interruptions in Vectice, especially under concurrent usage, we recommend:

Token Throughput

  • Support a throughput of 450,000 tokens per minute

  • This baseline ensures stable usage for ~5 to 10 active users

  • This budget applies to input and output tokens combined (see the sketch below)
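
To make this baseline concrete, here is a quick back-of-envelope sketch of the per-user budget it implies; the figures simply restate the numbers above.

```python
# Back-of-envelope check of the recommended throughput baseline.
TOKENS_PER_MINUTE = 450_000   # combined input + output token budget
ACTIVE_USERS = 10             # upper end of the ~5-10 user range

per_user = TOKENS_PER_MINUTE / ACTIVE_USERS
print(f"~{per_user:,.0f} tokens/min per active user")  # ~45,000
```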

Recommended Model Capabilities

  • Your LLM should match or exceed the performance of GPT-4o-mini on reasoning, summarization, and code analysis tasks.

  • Minimum equivalent model size: 7B–8B parameters trained on high-quality data (e.g., LLaMA 3.2+ 8B+, Mixtral 8x22B)

Azure OpenAI Configuration for GPT-4o (Recommended)

If using GPT-4o or GPT-4o-mini on Azure, configure content filtering to avoid macro execution errors (a verification sketch follows these steps):

  1. Go to Safety + Security → Content Filters

  2. Create a dedicated filter for Vectice usage

  3. Disable the following input filters:

    • Prompt shields for jailbreak attacks

    • Prompt shields for indirect attacks

  4. Associate the filter with your GPT-4o or 4o-mini deployment
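
Once the filter is associated, you can verify the deployment end to end. The sketch below is a minimal check, assuming an Azure OpenAI resource; the endpoint, API key, API version, and deployment name are placeholders for your own values. A request rejected by Azure content filtering surfaces as an HTTP 400 error with code `content_filter`, so a macro-style prompt that succeeds confirms the filter behaves as intended.

```python
# Minimal check that the dedicated Vectice content filter is applied.
# Placeholders: the endpoint, key, api_version, and deployment name below
# are hypothetical -- substitute your own Azure OpenAI values.
from openai import AzureOpenAI, BadRequestError

client = AzureOpenAI(
    azure_endpoint="https://your-resource.openai.azure.com",
    api_key="<your-api-key>",
    api_version="2024-06-01",
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",  # the name of your GPT-4o or GPT-4o-mini deployment
        messages=[{"role": "user", "content": "Execute this documentation macro."}],
    )
    print(response.choices[0].message.content)
except BadRequestError as err:
    # Azure content filtering rejects requests with HTTP 400 and code
    # "content_filter"; with prompt shields disabled on the Vectice filter,
    # macro-style prompts should no longer land here.
    print("Blocked by content filter:", err)
```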

Cost Calculation per User

Standard usage for a user actively documenting model development or validation is about 1.3 million tokens per month. For reference, the table below shows LLM pricing as of April 2025.

|                                  | GPT-4o | GPT-4o-mini | Claude 3.7 Sonnet | LLaMA 70B | Mixtral 8x22B+ |
| -------------------------------- | ------ | ----------- | ----------------- | --------- | -------------- |
| Cost per million tokens          | $5     | $0.60       | $3                | $0.72     | $2             |
| Monthly estimated cost per user  | < $7   | < $1        | < $4              | < $1      | < $3           |
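
The per-user estimates in the table follow directly from the usage figure above; a short sketch of the arithmetic:

```python
# Estimated monthly AskAI cost per active user, combining the ~1.3M
# tokens/month usage figure with the April 2025 prices from the table above.
MONTHLY_TOKENS_PER_USER = 1_300_000

PRICE_PER_MILLION = {       # USD per million tokens
    "GPT-4o": 5.00,
    "GPT-4o-mini": 0.60,
    "Claude 3.7 Sonnet": 3.00,
    "LLaMA 70B": 0.72,
    "Mixtral 8x22B+": 2.00,
}

for model, price in PRICE_PER_MILLION.items():
    cost = MONTHLY_TOKENS_PER_USER / 1_000_000 * price
    print(f"{model}: ~${cost:.2f}/user/month")   # e.g. GPT-4o: ~$6.50
```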

Common LLMs Used with Vectice

Enterprises typically integrate the following models based on their deployment strategy and governance needs:

| Model                | Provider       | Parameters | Deployment Mode                          |
| -------------------- | -------------- | ---------- | ---------------------------------------- |
| GPT-4o / GPT-4o-mini+ | OpenAI (Azure) | N/A        | Cloud (Azure-hosted) / Self-hosted       |
| Claude 3.7+ Sonnet   | Anthropic      | N/A        | Cloud (Amazon Bedrock) / Self-hosted     |
| LLaMA 3.3+ 70B       | Meta           | 70B        | Cloud (Amazon Bedrock) / Self-hosted     |
| Mixtral 8x22B+       | Mistral AI     | 8x22B      | Cloud (Amazon Bedrock) / Self-hosted     |

Need help evaluating which LLM works best for your use case? Reach out to our team at support@vectice.com.

For the latest pricing, refer to OpenAI's API pricing page and Amazon's Bedrock pricing page.
