Last updated on Jan 16, 2026

Supported Data Sources and Platforms

Introduction

DataFuse AI connects to a wide range of data sources and platforms, so it can manage, process, and analyze data from many environments. This guide gives a structured overview of the supported relational and NoSQL databases, cloud platforms, file systems, and upload options. For each category it covers the underlying technology, the key concepts, and how DataFuse AI connects to it.

1. Relational Databases (RDBMS)

What is an RDBMS?

A Relational Database Management System (RDBMS) is designed to store data in a structured format using tables, rows, and columns. RDBMSs use SQL (Structured Query Language) to define, manipulate, and query data, ensuring data integrity, consistency, and relationships between different entities. RDBMSs are ideal for applications where structured, transactional data management is necessary, such as financial applications or customer management systems.
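
The relational concepts above (tables, typed columns, relationships between entities, and integrity enforced by the database) can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for any RDBMS; the table names, column names, and the `insert_orphan_order` helper are hypothetical, and nothing here represents DataFuse AI's own driver API.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for any RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Two related tables: each order row must reference an existing customer row.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount REAL NOT NULL
    )
    """
)

with conn:  # commits on success, rolls back if an exception is raised
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
    conn.executemany(
        "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
        [(1, 20.0), (1, 22.5)],
    )

# SQL query joining the two tables through their relationship.
totals = conn.execute(
    """
    SELECT c.name, SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    """
).fetchall()


def insert_orphan_order(connection):
    """Try to insert an order for a customer that does not exist.

    Returns False when the database rejects the row, True otherwise.
    """
    try:
        with connection:
            connection.execute(
                "INSERT INTO orders (customer_id, amount) VALUES (99, 5.0)"
            )
    except sqlite3.IntegrityError:
        return False
    return True
```

The join shows how relationships are queried with SQL, and the rejected orphan row shows the database itself enforcing integrity rather than the application.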

Common RDBMS Platforms:

  • PostgreSQL

    • Type: Open-source, object-relational database
    • Features: Advanced SQL compliance, support for complex queries, strong extensibility
    • Use Cases: Web applications, analytics, geospatial data
  • MySQL

    • Type: Open-source, relational database
    • Features: Known for speed and reliability
    • Use Cases: Websites, e-commerce, and applications requiring quick data retrieval
  • MSSQL (Microsoft SQL Server)

    • Type: Enterprise-level relational database developed by Microsoft
    • Features: High performance, strong security features, integration with other Microsoft tools
    • Use Cases: Large-scale enterprise applications, data warehousing
  • OracleDB

    • Type: Commercial, enterprise relational database platform
    • Features: High availability, scalability, and advanced security
    • Use Cases: Large organizations needing enterprise-grade solutions
  • MariaDB

    • Type: Open-source relational database, forked from MySQL
    • Features: MySQL compatibility with additional storage engines, performance improvements, and security enhancements
    • Use Cases: Web applications, cloud environments, open-source projects
  • Snowflake

    • Type: Cloud-native data warehouse platform
    • Features: Scalability, data sharing, integrated data lake
    • Use Cases: Cloud data storage, data analytics
  • Redshift

    • Type: Cloud-based data warehouse (AWS)
    • Features: Optimized for large-scale data processing, scalable
    • Use Cases: Business intelligence, data analysis
  • SAP HANA

    • Type: In-memory relational database
    • Features: Real-time data processing, high performance
    • Use Cases: Real-time analytics, enterprise resource planning (ERP)
  • Vertica

    • Type: High-performance columnar database
    • Features: Advanced analytics, optimized for large datasets
    • Use Cases: Big data analytics, data warehousing
  • Teradata

    • Type: Enterprise data warehousing solution
    • Features: Highly scalable, parallel processing
    • Use Cases: Large-scale data storage and analysis

Integration with DataFuse AI:

DataFuse AI provides drivers for these relational databases. Through these drivers, users can query and manipulate data and build data pipelines across different relational platforms.

2. NoSQL Databases

What is a NoSQL Database?

NoSQL databases are designed to handle large volumes of unstructured or semi-structured data. Unlike RDBMS, NoSQL databases allow for flexible schema designs and offer horizontal scalability, making them ideal for big data applications, real-time processing, and applications that require rapid scaling. NoSQL databases support various data models, including key-value stores, document stores, column-family stores, and graph databases.
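
To make "flexible schema" concrete, here is a minimal sketch using only Python's standard library. It models a document collection as a list of parsed JSON objects; the sample documents and the `find` and `field_names` helpers are hypothetical illustrations, not the API of MongoDB or of any DataFuse AI driver.

```python
import json

# Two documents in the same collection with different fields (flexible schema).
raw_documents = [
    '{"_id": 1, "name": "Ada", "email": "ada@example.com"}',
    '{"_id": 2, "name": "Lin", "tags": ["beta"], "address": {"city": "Oslo"}}',
]
collection = [json.loads(doc) for doc in raw_documents]


def find(docs, field, value):
    """Return documents whose top-level field equals value.

    Documents that lack the field simply do not match; no schema change is needed.
    """
    return [d for d in docs if field in d and d[field] == value]


def field_names(docs):
    """Return the sorted union of top-level field names across all documents."""
    names = set()
    for d in docs:
        names.update(d)
    return sorted(names)
```

In a relational table, adding `tags` or a nested `address` would require altering the table definition; in a document model each record carries its own structure, which is why queries have to tolerate missing fields.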

Common NoSQL Platforms:

  • MongoDB

    • Type: Document-based NoSQL database
    • Features: Stores data in JSON-like format, flexible schema
    • Use Cases: Content management systems, real-time analytics
  • Cassandra

    • Type: Distributed NoSQL database
    • Features: High availability, scalability across commodity hardware
    • Use Cases: Large-scale applications with high write throughput
  • Couchbase

    • Type: Multi-model NoSQL database
    • Features: Document and key-value store with full-text search capabilities
    • Use Cases: Real-time applications, mobile apps
  • Azure Cosmos DB

    • Type: Globally distributed, multi-model NoSQL database
    • Features: Supports document, key-value, graph, and column-family models
    • Use Cases: Global applications, IoT systems, real-time analytics
  • BigQuery

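    • Type: Serverless, fully managed data warehouse (GCP). BigQuery is queried with SQL rather than a NoSQL data model; it appears in this section because DataFuse AI categorizes it as NoSQL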
    • Features: Scalable, real-time analytics on large datasets
    • Use Cases: Data analysis, machine learning

Integration with DataFuse AI:

DataFuse AI integrates with NoSQL databases through dedicated drivers for platforms like MongoDB, Cassandra, and BigQuery, enabling data ingestion, real-time analytics, and processing for applications that use NoSQL data models.

3. Cloud Platforms

What are Cloud Platforms?

Cloud platforms provide on-demand infrastructure and managed services for hosting, storing, and processing data over the internet. They offer scalability, flexibility, and managed databases and storage systems, giving cost-effective, highly available options for data management.

Common Cloud Platforms:

  • AWS (Amazon Web Services)

    • Services:

      • Amazon RDS: Managed relational databases (including MySQL, PostgreSQL, MariaDB, and Oracle)
      • Amazon S3: Object storage for large datasets
    • Use Cases: Cloud data storage, scalable computing resources, managed database services

  • Azure (Microsoft)

    • Services:

      • Azure SQL Database: Managed SQL Server relational database
      • Azure Cosmos DB: Multi-model NoSQL database
      • Azure Blob Storage: Object storage for unstructured data
    • Use Cases: Data storage, cloud applications, enterprise solutions

  • GCP (Google Cloud Platform)

    • Services:

      • BigQuery: Serverless data warehouse for big data analysis
      • Google Cloud Storage: Object storage service for data and backups
    • Use Cases: Data analytics, large-scale storage solutions

Integration with DataFuse AI:

DataFuse AI integrates with cloud storage services such as Amazon S3, Azure Blob Storage, and Google Cloud Storage, as well as managed cloud databases such as Amazon RDS and Azure SQL Database, so data workflows and analytics can run against cloud-hosted data.

4. File Systems

What is a File System?

A file system stores, retrieves, and manages files on a computer or storage device. In this guide, the category also covers file transfer protocols (FTP and SFTP) and object storage (S3), which are used in cloud and distributed systems to hold large datasets and backups and to move files between systems.

Common File Systems:

  • FTP (File Transfer Protocol)

    • Type: Standard network protocol for transferring files over TCP/IP
    • Use Cases: Basic file transfers over the internet
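  • SFTP (SSH File Transfer Protocol, often called Secure File Transfer Protocol)

    • Type: File transfer protocol that runs over SSH and encrypts both credentials and data; it is a separate protocol from FTP, not an extension of it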
    • Use Cases: Secure file exchanges, compliance with data security standards
  • S3 (Amazon Simple Storage Service)

    • Type: Object storage service
    • Use Cases: Storing large volumes of unstructured data, backups, and media files

Integration with DataFuse AI:

DataFuse AI supports FTP, SFTP, and S3 integrations, allowing users to transfer and manage files and to work with cloud-based storage for large datasets. Because plain FTP does not encrypt traffic, SFTP is the better choice when files must be transferred securely.
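
The difference between the three source types shows up in how a location is addressed. The sketch below, using only Python's standard library, parses a location URI and reports what kind of source it points to. The `describe_location` function, the returned field names, and the example URIs are all hypothetical; this is not how DataFuse AI itself configures connections.

```python
from urllib.parse import urlparse

# Standard default ports: 21 for FTP, 22 for SSH (which carries SFTP).
DEFAULT_PORTS = {"ftp": 21, "sftp": 22}


def describe_location(uri):
    """Classify a file location URI as object storage or a file transfer endpoint."""
    parts = urlparse(uri)
    scheme = parts.scheme.lower()
    if scheme == "s3":
        # Object storage: addressed by bucket and key, not by host and directory.
        return {
            "kind": "object_storage",
            "bucket": parts.netloc,
            "key": parts.path.lstrip("/"),
        }
    if scheme in DEFAULT_PORTS:
        return {
            "kind": "file_transfer",
            "protocol": scheme,
            "encrypted": scheme == "sftp",  # plain FTP sends data unencrypted
            "host": parts.hostname,
            "port": parts.port or DEFAULT_PORTS[scheme],
            "path": parts.path,
        }
    raise ValueError(f"unsupported scheme: {scheme!r}")
```

For example, `describe_location("sftp://user@files.example.com/incoming/data.csv")` yields a file transfer endpoint on port 22 marked as encrypted, while an `s3://` URI yields a bucket and key.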

5. Upload Services

What is Upload?

The Upload service allows users to manually upload data files into the system for processing or integration with other data sources. This feature is essential for incorporating external datasets into DataFuse AI’s workflows.

Integration with DataFuse AI:

DataFuse AI offers an Upload driver to facilitate the uploading of various file types (e.g., CSV, JSON), making it easy for users to import and use data within the platform.
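
As a rough illustration of what importing such files involves, the sketch below turns uploaded CSV or JSON text into a uniform list of records using Python's standard library. The `parse_upload` function and its extension-based dispatch are assumptions made for illustration; the actual behavior of the Upload driver is not described in this guide.

```python
import csv
import io
import json


def parse_upload(filename, text):
    """Parse uploaded CSV or JSON text into a list of dict records.

    The file type is chosen from the filename extension (an assumption of this sketch).
    """
    suffix = filename.rsplit(".", 1)[-1].lower()
    if suffix == "csv":
        # The first row is the header; every value is read as a string.
        return list(csv.DictReader(io.StringIO(text)))
    if suffix == "json":
        data = json.loads(text)
        # Accept either a JSON array of records or a single JSON object.
        return data if isinstance(data, list) else [data]
    raise ValueError(f"unsupported file type: {filename!r}")
```

Note that CSV carries no type information, so `id` in a CSV file arrives as the string "1", while the same field in JSON arrives as a number. Any downstream step that combines both formats has to reconcile those types.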

6. Supported Platforms and Data Sources Overview

Here’s a quick reference table summarizing the supported data sources and platforms available in DataFuse AI:

Data Sources

Data Source         Platform   Category
BigQuery            GCP        NoSQL
Cassandra           NoSQL      NoSQL
Couchbase           NoSQL      NoSQL
MongoDB             NoSQL      NoSQL
MSSQL               RDBMS      RDBMS
OracleDB            RDBMS      RDBMS
PostgreSQL          RDBMS      RDBMS
MySQL               RDBMS      RDBMS
Snowflake           RDBMS      RDBMS
Redshift            RDBMS      RDBMS
SAP HANA            RDBMS      RDBMS
S3                  AWS        Cloud Storage
FTP                 FTP        File System
SFTP                SFTP       File System
Azure CosmosDB      Azure      NoSQL
Azure SQL Server    Azure      RDBMS
RDS MariaDB         AWS        RDBMS
RDS MySQL           AWS        RDBMS
RDS PostgreSQL      AWS        RDBMS
Azure PostgreSQL    Azure      RDBMS

7. Conclusion

Knowing which data sources and platforms DataFuse AI supports is the starting point for building efficient data workflows. By connecting to relational and NoSQL databases, cloud storage, file systems, and uploaded files, DataFuse AI lets users process and analyze data from a wide variety of platforms.