Using Custom Models with NeuralFlow

Three ways to integrate your own fine-tuned or proprietary AI models into NeuralFlow pipelines.

While NeuralFlow integrates with all major AI providers, sometimes you need to use your own fine-tuned or proprietary models. This guide covers three ways to bring custom models into your NeuralFlow pipelines.

Option 1: Model API Endpoint

If your model is already deployed behind an API, you can connect it to NeuralFlow as a custom endpoint:


from neuralflow import CustomModel

model = CustomModel(
    name="my-fine-tuned-bert",
    endpoint="https://models.mycompany.com/bert-v2/predict",
    auth={"type": "bearer", "token_env": "MY_MODEL_TOKEN"},
    input_schema={"text": "string"},
    output_schema={"label": "string", "score": "number"}
)
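
To sanity-check an endpoint before registering it, it can help to call it directly and confirm that the response matches the declared output schema. The following is a minimal sketch using only the Python standard library. It assumes the endpoint accepts a JSON POST body shaped like the input schema and returns a JSON object shaped like the output schema. The helper names (`matches_schema`, `build_request`, `predict`) and the type-name mapping are illustrative and are not part of NeuralFlow.

```python
import json
import os
import urllib.request

ENDPOINT = "https://models.mycompany.com/bert-v2/predict"
OUTPUT_SCHEMA = {"label": "string", "score": "number"}

# Assumed mapping from the schema's type names to Python types.
SCHEMA_TYPES = {"string": str, "number": (int, float)}


def matches_schema(payload: dict, schema: dict) -> bool:
    """Return True if payload has exactly the schema's keys with matching types."""
    if set(payload) != set(schema):
        return False
    for key, type_name in schema.items():
        value = payload[key]
        # bool is a subclass of int, so reject booleans outright
        # (this schema has no boolean type).
        if isinstance(value, bool) or not isinstance(value, SCHEMA_TYPES[type_name]):
            return False
    return True


def build_request(text: str, token: str) -> urllib.request.Request:
    """Build a bearer-authenticated JSON POST request for the endpoint."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def predict(text: str) -> dict:
    """Call the endpoint and validate the response (performs a network call)."""
    token = os.environ["MY_MODEL_TOKEN"]  # same variable named in token_env
    with urllib.request.urlopen(build_request(text, token), timeout=10) as resp:
        result = json.load(resp)
    if not matches_schema(result, OUTPUT_SCHEMA):
        raise ValueError(f"Unexpected response shape: {result!r}")
    return result
```

Calling `predict("great product")` with `MY_MODEL_TOKEN` set would raise early if the deployed model returns anything other than a string `label` and a numeric `score`.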
    

Option 2: Container Deployment

Upload your model as a Docker container, and NeuralFlow handles scaling, monitoring, and load balancing:


# Package your model
nf model package ./my-model --runtime python3.11 --gpu a100

# Deploy to NeuralFlow's infrastructure
nf model deploy my-fine-tuned-bert:v2 --replicas 3 --gpu-type a100
    

Option 3: Hugging Face Integration

Deploy any Hugging Face model directly:


from neuralflow import HuggingFaceModel

model = HuggingFaceModel(
    repo="myorg/custom-sentiment-v3",
    task="text-classification",
    device="cuda"
)
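
Hard-coding `device="cuda"` presumes a GPU is present wherever the pipeline runs. A minimal sketch of choosing the device at runtime follows. It assumes that `HuggingFaceModel` also accepts `"cpu"` as a device value, which this guide does not confirm, and `pick_device` is a hypothetical helper rather than a NeuralFlow function.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # optional dependency; absent on CPU-only setups
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

The result could then be passed as `device=pick_device()` in place of the literal string.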
    

Performance Tips

  • Use GPU instances for transformer models; CPU inference is often 10-50x slower
  • Enable request batching for throughput-sensitive pipelines
  • Set up auto-scaling rules based on queue depth, not just CPU usage
  • Cache responses for deterministic inputs to reduce costs
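
The batching, auto-scaling, and caching tips above can be sketched in plain Python. This is an illustration of the ideas only, not NeuralFlow's configuration API: `fake_predict_batch` stands in for a real model call, and every function name, batch size, and scaling threshold here is an assumption chosen for the example.

```python
from functools import lru_cache
from math import ceil
from typing import Iterable, Iterator, List


def batched(items: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Group items into lists of at most batch_size elements."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, partially filled batch
        yield batch


def fake_predict_batch(texts: List[str]) -> List[dict]:
    """Stand-in for a model call that scores a whole batch in one request."""
    return [{"label": "long" if len(t) > 10 else "short", "score": 1.0} for t in texts]


def run_batched(texts: Iterable[str], batch_size: int = 32) -> List[dict]:
    """Send one request per batch instead of one request per input."""
    results: List[dict] = []
    for batch in batched(texts, batch_size):
        results.extend(fake_predict_batch(batch))
    return results


@lru_cache(maxsize=10_000)
def cached_predict(text: str) -> tuple:
    """Memoize results; only safe when the model is deterministic for a given input."""
    result = fake_predict_batch([text])[0]
    return (result["label"], result["score"])


def desired_replicas(queue_depth: int, target_per_replica: int,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Scale on pending requests: one replica per target_per_replica queued items."""
    needed = ceil(queue_depth / target_per_replica)
    return max(min_replicas, min(max_replicas, needed))
```

Scaling on queue depth reacts to backlog directly, whereas CPU usage can stay low on GPU-bound workloads even when requests are piling up.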