May 23, 2024
Open-source SuperDuperDB brings AI into enterprise databases



San Francisco-based SuperDuperDB, an Intel Ignite portfolio company working to simplify how teams build and deploy AI apps, today released version 0.1 of its open-source framework.

Available as a Python package, the framework lets users integrate AI, from machine learning (ML) models to the AI application programming interfaces (APIs) of their choice, along with vector search capabilities, into existing databases and build AI applications directly on top of them.

The framework already supports popular AI models and databases, and the company has received $1.75 million in early funding from Hetz.vc, Session.vc and the venture capital arm of data ecosystem major MongoDB.

“MongoDB’s backing is a testament to the attitude and transformative potential of SuperDuperDB. Our vision is to bridge the gap between data storage systems and AI, making it incredibly simple for organizations to build and manage AI applications by enabling a new era of software development in which data and AI form a close-knit symbiosis,” Timo Hagenow, CEO at SuperDuperDB, said in a statement.

The framework is available on Product Hunt starting today.

Solving the AI problem with SuperDuperDB

AI is rapidly becoming a standard technology powering modern enterprise operations, but building applications that tap powerful ML models and proprietary data to deliver business value is no cakewalk.

Even with a wide range of ML models and APIs available, developers have to put in a lot of effort just to bring them into production. They have to use tools from the “MLOps” and “DevOps” ecosystems to extract data from their main databases and move it to specialized vector databases through a series of intricate and fragile pipelines. This takes time and can often delay the launch of projects.

“Startups and innovation in the domain of ‘making AI easier’ have either tended to focus solely on making it easy to deploy algorithms on compute resources or on combining the algorithms and data in a convoluted series of pipelines, in a field known as MLOps,” Hagenow told VentureBeat.

To solve this challenge and give teams an easy way to combine their algorithms with the data that gives them value, Hagenow and team created SuperDuperDB, a framework that brings AI models (including streaming inference and scalable model training) directly to the database being used by the enterprise, rather than the other way around.

“SuperDuperDB may be installed simply from open-source as a Python package and allows developers to set up a single scalable deployment of all his/her AI models and APIs, which directly communicates with the database. This transforms the database into a(n) (‘super-duper’) AI development and deployment environment. The environment may be deployed in standalone experimental mode, on a single client, or with scalable compute in a cloud or on-premise environment via Kubernetes, using best-in-class open-source deployment software. This gives end-to-end open-source control to the developer and administrator(s) over algorithms, data, compute and infrastructure,” Hagenow explained.

With the framework, developers can use not just standard machine learning models for applications like classification, regression and recommendation systems, but also the latest generative AI models to enable LLM-based chat and vector search, as well as highly custom models for specialized use cases. For vector search, the framework uses either the in-database vector functionality provided by database vendors or its own vector index implementation.
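To make the idea of searching vectors that live next to the data more concrete, here is a minimal sketch in plain Python. It is not SuperDuperDB's API: it uses only the standard library, an in-memory SQLite table stands in for the enterprise database, and the function names (toy_embed, build_store, search) and the word-count "embedding" are assumptions made for illustration. A real deployment would presumably use a trained model and either the database vendor's vector functionality or a proper vector index.

```python
import json
import math
import sqlite3
from collections import Counter


def toy_embed(text):
    """Stand-in for a real embedding model: a sparse bag-of-words count vector."""
    return dict(Counter(text.lower().split()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(word, 0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_store(texts):
    """Create an in-memory table that keeps each row's vector next to its data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, embedding TEXT)"
    )
    conn.executemany(
        "INSERT INTO docs (body, embedding) VALUES (?, ?)",
        [(t, json.dumps(toy_embed(t))) for t in texts],
    )
    conn.commit()
    return conn


def search(conn, query, k=1):
    """Brute-force nearest-neighbour search over the vectors stored in the table."""
    q = toy_embed(query)
    rows = conn.execute("SELECT body, embedding FROM docs ORDER BY id").fetchall()
    scored = sorted(rows, key=lambda row: cosine(q, json.loads(row[1])), reverse=True)
    return [body for body, _ in scored[:k]]


conn = build_store([
    "fraud detection in financial services",
    "supply chain optimization in logistics",
    "drug discovery in healthcare",
])
print(search(conn, "fraud detection"))
```

The point of the sketch is that no data leaves the database for a separate vector store: the vectors are written into the same table as the rows they describe, and the search runs against that table.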

Superstrong partner ecosystem

While the product is just a few months old, it has already gained significant traction with major ecosystem players, and it gives enterprise teams broad support for popular databases and models.

On the data side, it supports MongoDB, PostgreSQL, MySQL, SQLite, DuckDB, Snowflake, BigQuery, ClickHouse, DataFusion, Druid, Impala, MSSQL, Oracle, pandas, Polars, PySpark, Trino and S3. Meanwhile, on the AI side, it supports arbitrary models from the Python AI ecosystem, models from PyTorch, scikit-learn and Hugging Face, and popular AI APIs from vendors such as OpenAI, Anthropic and Cohere.

“MongoDB made us an official technology partner and we have already run webinars and live coding sessions with major accounts such as Cisco. We are also currently evaluating a range of POCs with Intel and a few other SMEs,” the CEO said, without sharing specific growth stats. 

Architecture of SuperDuperDB

He also noted that the company is moving to expand its ecosystem and is in talks with other major database organizations regarding closer partnerships. The ultimate goal is to integrate seamlessly with enterprise data platforms, such as Databricks and Snowflake. For Snowflake, the company is already planning a native app that will launch on the data cloud major’s marketplace. 

Potential across applications

If this goes mainstream, building and deploying AI applications will become relatively easy for teams, regardless of the sector they are in. 

“By combining SuperDuperDB’s technology with MongoDB Atlas Vector Search, the developer journey to using AI is significantly accelerated. Across many industries, ranging from fraud detection in financial services to supply chain optimization in logistics to novel drug discovery in healthcare and life sciences, developers can now quickly and efficiently build and ship modern applications,” Boris Bialek, the field CTO of industry solutions at MongoDB, noted.

There are some in-database AI offerings in the market, including MindsDB and PostgresML, but Hagenow pointed out that they are all SQL-based, which forces developers to adapt and migrate to their SQL dialects. SuperDuperDB, on the other hand, is Python-first, building on the language most widely used in AI research and development.

“SuperDuperDB provides a simple (and familiar) Python interface but allows experts to drill down to any level of implementation detail such as model weights or training details. What’s more, SuperDuperDB allows developers to work directly with images, video, audio in the database, and any type of data that can be encoded as bytes in Python. There is nothing else like this in AI open-source,” he said.
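As a rough illustration of what "any type of data that can be encoded as bytes in Python" can mean in practice, the sketch below stores a short run of audio-like samples as a binary column and reads it back. It is an assumption-laden example rather than SuperDuperDB's own interface: it uses only the standard library, SQLite stands in for the database, and the helper names (encode_samples, decode_samples, save_clip, load_clip) are hypothetical.

```python
import sqlite3
from array import array


def encode_samples(samples):
    """Pack a list of floats into raw bytes (double precision)."""
    return array("d", samples).tobytes()


def decode_samples(blob):
    """Unpack raw bytes back into a list of floats."""
    out = array("d")
    out.frombytes(blob)
    return out.tolist()


def save_clip(conn, name, samples):
    """Store the encoded samples in a BLOB column next to their metadata."""
    conn.execute(
        "INSERT INTO clips (name, payload) VALUES (?, ?)",
        (name, encode_samples(samples)),
    )
    conn.commit()


def load_clip(conn, name):
    """Fetch the BLOB by name and decode it; returns None if the name is unknown."""
    row = conn.execute(
        "SELECT payload FROM clips WHERE name = ?", (name,)
    ).fetchone()
    return decode_samples(row[0]) if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clips (name TEXT PRIMARY KEY, payload BLOB)")
save_clip(conn, "beep", [0.0, 0.5, -0.25, 1.0])
print(load_clip(conn, "beep"))
```

The same encode and decode pairing would apply to images or video frames: whatever the media type, the database only needs to hold the bytes, and the Python side decides how to interpret them.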
