Dev tools · 20 results · Refreshed monthly · Free preview

Companies building inference APIs for large language models with 20-100 employees

Explore companies matching “companies building inference APIs for large language models with 20-100 employees” through Canonical’s structured public company search.

Query

Companies building inference APIs for large language models with 20-100 employees

Refine this query
Company Description: inference APIs for large language models
Employee Size Range: 11-50 employees

Canonical uses these criteria to surface a broader shortlist, including niche and emerging companies that are easy to miss with generic search.

Top 20 results

Inception Labs
Domain: inceptionlabs.ai
Description: Inception Labs is an artificial intelligence company specializing in the development of advanced Large Language Models (LLMs). They are known for their "Diffusion LLMs," which aim to provide breakthroughs in speed and quality for AI applications, particularly in coding assistance.
HQ Location: Palo Alto, California, United States

Martian
Domain: withmartian.com
Description: Martian provides a platform that optimizes the use of Large Language Models (LLMs) for developers and companies. Their solution aims to enhance AI integration by dynamically routing prompts to the most suitable LLM, thereby improving performance, reducing costs, and ensuring compliance.
HQ Location: San Francisco, California, United States

BentoML
Domain: bentoml.com
Description: BentoML provides a comprehensive platform for AI inference, enabling organizations to deploy, manage, monitor, and optimize their AI models efficiently. Their solution focuses on delivering speed and control for AI inference at scale, simplifying complex infrastructure challenges.
HQ Location: San Francisco, California, United States

Maitai
Domain: trymaitai.ai
Description: Maitai provides enterprise-grade Large Language Models (LLMs) with industry-leading inference speeds and low latency. They focus on delivering reliable and efficient AI solutions for businesses looking to deploy and scale AI agents and LLMs.
HQ Location: San Francisco, California, United States

Neural Magic
Domain: neuralmagic.com
Description: Neural Magic offered B2B software and SaaS solutions for optimizing AI model inference, particularly for large language models, across CPU and GPU architectures. They served enterprise clients and developers in various industries seeking to improve the efficiency and scalability of AI deployments by providing tools like DeepSparse and Neural Magic Compress, leveraging techniques such as GPTQ and SparseGPT for enhanced performance. Their revenue model combined licensing fees and subscription revenue.
HQ Location: Somerville, Massachusetts, United States

Deep Infra
Domain: deepinfra.com
Description: DeepInfra provides developer-friendly APIs for AI inference, focusing on performance and cost-efficiency. Their platform enables businesses to accelerate AI deployment, scale to trillions of tokens, and host AI models efficiently.
HQ Location: Palo Alto, California, United States

SiliconFlow
Domain: siliconflow.cn
Description: SiliconFlow is an AI company focused on accelerating AI model deployment and inference. They provide a suite of AI services designed to enhance performance and reduce costs for AI applications. Their mission is to make advanced AI more accessible and efficient for developers and enterprises.
HQ Location: Singapore, Central Region, Singapore

Vessl AI
Domain: vessl.ai
Description: Vessl AI offers a Platform-as-a-Service (PaaS) solution for deploying and managing large language models (LLMs) and other AI workloads, using a usage-based revenue model likely charging per GPU hour or based on resource consumption. Targeting developers and businesses globally, Vessl AI simplifies AI model deployment by abstracting away complex infrastructure, supporting LLMs like Llama 3.2 and integrating with technologies such as LlamaParse and Pinecone. The platform emphasizes ease of use, scalability, and cost-effectiveness within a competitive market.
HQ Location: San Francisco, California, United States

Adaptive ML
Domain: adaptive-ml.com
Description: Adaptive ML provides the "Adaptive Engine," a platform designed to evaluate, tune, and serve large language models (LLMs) for enterprise applications. They focus on accelerating the production deployment of AI models and offering advanced fine-tuning capabilities.
HQ Location: Paris, Île-de-France, France

Chai
Domain: chai-research.com
Description: Chai Research is building a platform for Social AI, providing tools and infrastructure for AI development and deployment. Their focus is on creating AI that is both informative and engaging, particularly for applications involving large language models.
HQ Location: Palo Alto, California, United States

Embedded LLM
Domain: embeddedllm.com
Description: Embedded LLM provides JamAI Base, a platform designed to accelerate and secure AI workflows, particularly for Large Language Models (LLMs). Their solution focuses on optimizing LLM pipelines for businesses and enterprises.
HQ Location: Singapore, Central Region, Singapore

Recursal.ai
Domain: featherless.ai
Description: Featherless AI provides developers and businesses with access to a vast library of over 12,100 open AI models through an API. Their platform enables instant deployment for inference, fine-tuning, testing, and production, aiming to democratize AI model utilization.
HQ Location: San Francisco, California, United States

Inceptron
Domain: inceptron.io
Description: Inceptron provides a platform for building, optimizing, and deploying AI models, focusing on compiler-driven performance to achieve significant cost efficiencies and lower latency. Their solution aims to make AI model deployment more accessible and cost-effective for businesses and developers.
HQ Location: (not listed)

Super Protocol
Domain: superprotocol.com
Description: Super Protocol provides a confidential computing platform that enables secure AI inference, particularly for Large Language Models (LLMs). Their solution allows businesses and developers to integrate advanced AI capabilities into their products while ensuring the privacy and security of sensitive data.
HQ Location: New York, New York, United States

Tensorfuse
Domain: tensorfuse.io
Description: Tensorfuse provides a platform for fine-tuning, deploying, and auto-scaling generative AI models on AWS. It offers serverless inference, job queues, and development containers to streamline the AI model lifecycle for developers and organizations.
HQ Location: San Francisco, California, United States

ElastixAI
Domain: elastix.ai
Description: ElastixAI provides a next-generation AI inference platform designed for businesses looking to deploy and scale AI applications efficiently. Their platform focuses on delivering breakthrough total cost of ownership (TCO) per token, dynamic adaptability, and continuous evolution to meet the demands of evolving AI use.
HQ Location: Seattle, Washington, United States

Doubleword
Domain: doubleword.ai
Description: Doubleword provides an LLMOps platform for enterprises to deploy and manage private, production-grade Generative AI (GenAI) APIs. Their solution allows businesses to run open-source and custom language models securely at scale, supporting various deployment environments.
HQ Location: London, England, United Kingdom

AI/ML API
Domain: aimlapi.com
Description: AI/ML API provides developers and businesses with API access to a vast library of over 300 AI models. Their platform enables seamless integration of AI capabilities, including chat, content generation, and data analysis, into various applications and services.
HQ Location: (not listed)

ZML
Domain: zml.ai
Description: ZML provides a high-performance AI model inference platform optimized for any model and any hardware. Their solution simplifies deployment and enhances performance for businesses deploying AI models in production.
HQ Location: (not listed)

GaiaNet
Domain: gaianet.ai
Description: GaiaNet is building a decentralized ecosystem for the development, deployment, and scaling of AI applications. Their platform aims to foster a collaborative environment where AI models can learn, improve, and grow, offering a more open and scalable alternative to traditional centralized AI infrastructure.
HQ Location: Berkeley, California, United States

Go deeper

Run your own company query with custom filters

This page is a free preview. Sign up to modify criteria, refresh results, and export your shortlist.

Why this is useful

This page helps teams discover companies tied to a specific capability or workflow.

What Canonical interprets

Canonical turns the use-case query into structured criteria that can surface long-tail company matches beyond simple keyword search.
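
As a rough illustration of what "structured criteria" could look like, here is a minimal Python sketch that splits a free-text query into a description and an employee range. The `Criteria` class, the `parse_query` function, and the regular expressions are hypothetical names invented for this example; this is not Canonical's implementation. The sketch extracts the size range literally and makes no attempt to reproduce how Canonical maps a range onto its own size buckets.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Criteria:
    """Hypothetical structured form of a free-text company query."""
    description: str
    employee_range: Optional[Tuple[int, int]] = None


# Matches a trailing size clause such as "with 20-100 employees".
_SIZE_CLAUSE = re.compile(
    r"\s+with\s+(\d+)\s*-\s*(\d+)\s+employees\s*$", re.IGNORECASE
)
# Matches a leading phrase such as "Companies building ".
_PREFIX = re.compile(
    r"^companies\s+(?:building|offering|providing)\s+", re.IGNORECASE
)


def parse_query(query: str) -> Criteria:
    """Split a query into a description and an optional employee range."""
    text = query.strip()
    employee_range = None
    match = _SIZE_CLAUSE.search(text)
    if match:
        employee_range = (int(match.group(1)), int(match.group(2)))
        text = text[: match.start()]  # drop the size clause from the text
    description = _PREFIX.sub("", text)
    return Criteria(description=description, employee_range=employee_range)


criteria = parse_query(
    "Companies building inference APIs for large language models "
    "with 20-100 employees"
)
print(criteria)
```

A real system would presumably rely on something more robust than two regular expressions, but the separation into a description field and a size field mirrors the criteria shown under "Refine this query".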

How to adapt for your use case

Use this example as a starting point, then refine by segment, geography, company size, customer type, or funding signal after signup.