
From Zero to One: Building an Enterprise AI Data Analysis System in 1 Hour (Complete Tutorial)

AskTable Team 2026-03-03

Most enterprises face the same dilemma: the data sits in databases, but business staff can't query it, and the technical staff who can are too busy. Traditional BI tools are costly to learn and complex to deploy, so they are often bought and then left unused.

This article walks you step by step through building an enterprise-level AI data analysis system with AskTable in 1 hour, so that business staff can query data directly in natural language without learning SQL or relying on the technical team.

Tutorial Goals

After completing this tutorial, you will be able to:

  • ✅ Connect to enterprise databases (MySQL, PostgreSQL, SQL Server, etc.)
  • ✅ Configure a business semantic layer so the AI understands business terminology
  • ✅ Query data using natural language, such as "Top 10 products by sales this month"
  • ✅ Set up permission control to ensure data security
  • ✅ Create reusable data analysis templates

  • Time required: 60 minutes
  • Technical requirements: No programming background needed; knowing how to use Excel is enough
  • Applicable scenarios: Small to medium enterprises, startup teams, department-level data analysis

Preparation (5 minutes)

1. Register for AskTable Account

Visit the AskTable official website and click "Free Trial" to register.

Choose a deployment method:

  • Cloud SaaS: Suitable for quick experience, no deployment needed, ready to use upon registration
  • Private deployment: Suitable for enterprises with data security requirements, requires servers

This tutorial uses Cloud SaaS as an example.

2. Prepare Data Source Information

You need to prepare the following information (using MySQL as example):

  • Database address: Such as db.example.com:3306
  • Database name: Such as sales_db
  • Username: Such as readonly_user
  • Password: Database password

Security suggestions:

  • Create a read-only account; don't use the administrator account
  • Grant access only to the tables you need
  • If the database is on an internal network, configure an IP whitelist or a VPN
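The first two suggestions can be sketched as a small Python helper that prints the MySQL statements for a SELECT-only account. This is only an illustration: the helper name `readonly_grants`, the `%` host pattern, and the table list are assumptions, and the password placeholder must be replaced before the statements are run by a database administrator.

```python
def readonly_grants(user, host, database, tables):
    """Build MySQL statements that give `user` SELECT-only access to `tables`."""
    account = f"'{user}'@'{host}'"
    # Placeholder password: replace it before running the statement.
    statements = [f"CREATE USER {account} IDENTIFIED BY '<choose-a-password>';"]
    for table in tables:
        # One GRANT per table, so nothing outside the list becomes readable.
        statements.append(f"GRANT SELECT ON `{database}`.`{table}` TO {account};")
    return statements


grants = readonly_grants("readonly_user", "%", "sales_db",
                         ["orders", "products", "customers"])
print("\n".join(grants))
```

Replacing `%` with the specific address that AskTable connects from narrows the account further.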

Don't have existing data? You can use our example database:

  • Address: demo.asktable.com:3306
  • Database: ecommerce_demo
  • Username: demo_user
  • Password: demo2026

Step 1: Connect Data Source (10 minutes)

1. Create Data Source Connection

After logging into AskTable, go to the "Data Sources" page:

  1. Click "Add Data Source"
  2. Select database type (MySQL)
  3. Fill in connection information:
    Connection name: Sales Database
    Host address: db.example.com
    Port: 3306
    Database name: sales_db
    Username: readonly_user
    Password: ********
    
  4. Click "Test Connection"
  5. After successful connection, click "Save"

Common issues:

Q: The connection fails with "Cannot connect to database".
A: Check the following:

  • Are the database address and port correct?
  • Does the database allow external network access? (Check the firewall.)
  • Are the username and password correct?
  • Do you need to add AskTable's IP to the whitelist?
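To separate network problems from credential problems, a quick reachability check can help. The sketch below only tests whether a TCP connection to the host and port can be opened; it does not log in to the database. The function name `can_reach` is hypothetical, and the check has to run from a machine on the same network path as the client that will connect.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False
```

For example, `can_reach("db.example.com", 3306)` returning False points at the address, port, firewall, or whitelist rather than at the username and password.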

Q: My database is on an internal network and cannot be reached from outside.
A: There are two options:

  • Option 1: Use AskTable's private deployment version
  • Option 2: Set up a secure tunnel (NAT traversal) or a VPN

2. Select Tables to Analyze

After successful connection, AskTable will automatically read all tables in the database:

  1. Browse the table list and select tables needed for analysis
  2. Check related tables (such as orders, products, customers)
  3. Click "Sync Metadata"

Tips:

  • It's recommended to first select 3-5 core tables, not all at once
  • You can add new tables anytime

3. View Table Structure

Click on a table (such as orders) to view the table structure:

Table name: orders (Orders Table)
Fields:
- order_id (int): Order ID
- user_id (int): User ID
- product_id (int): Product ID
- amount (decimal): Order amount
- status (varchar): Order status
- created_at (datetime): Creation time
- paid_at (datetime): Payment time

Table relationships: AskTable will automatically identify foreign key relationships, such as:

  • orders.user_id → customers.user_id
  • orders.product_id → products.product_id

If automatic identification is inaccurate, you can manually configure table relationships.

Step 2: Configure Business Semantic Layer (20 minutes)

The business semantic layer is what lets the AI understand business language. We need to tell the AI:

  • What "sales" means
  • What "this month" refers to
  • What the definition of "active user" is

1. Define Core Metrics

Go to the "Semantic Layer" page and click "Add Metric":

Example 1: Sales (GMV)

Metric name: Sales
English name: GMV
Description: Total amount of paid orders
Calculation method: Aggregation
Aggregation function: SUM
Field: orders.amount
Filter conditions:
  - orders.status IN ('paid', 'completed')
Unit: Yuan
Synonyms:
  - Revenue
  - Transaction Volume
  - Total Sales
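To make the idea concrete, here is a minimal sketch of how a metric definition like the one above could be turned into SQL. It is not AskTable's implementation: the dictionary layout and the `metric_to_sql` helper are assumptions for illustration, and the result is checked against a throwaway SQLite table rather than MySQL.

```python
import sqlite3

# Hypothetical in-memory form of the "Sales" metric defined above.
SALES_METRIC = {
    "alias": "gmv",
    "function": "SUM",
    "table": "orders",
    "field": "amount",
    "filters": ["status IN ('paid', 'completed')"],
}


def metric_to_sql(metric, extra_filters=()):
    """Render a metric definition as a single SELECT statement."""
    conditions = list(metric["filters"]) + list(extra_filters)
    sql = (f"SELECT {metric['function']}({metric['field']}) AS {metric['alias']} "
           f"FROM {metric['table']}")
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql


sales_sql = metric_to_sql(SALES_METRIC)

# Check the generated SQL against a tiny sample table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 100.0, "paid"), (2, 50.0, "cancelled"), (3, 25.5, "completed")],
)
total_sales = conn.execute(sales_sql).fetchone()[0]
print(sales_sql)
print(total_sales)  # the cancelled order is excluded
```

The `extra_filters` argument is where a time range or a shared business rule could be appended with AND.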

Example 2: Order Volume

Metric name: Order Volume
English name: Order Count
Description: Total number of orders
Calculation method: Count
Aggregation function: COUNT
Field: orders.order_id
Filter conditions:
  - orders.status != 'cancelled'
Synonyms:
  - Order Count
  - Number of Transactions

Example 3: Average Order Value

Metric name: Average Order Value
English name: AOV (Average Order Value)
Description: Average amount per order
Calculation method: Custom
SQL expression: SUM(amount) / COUNT(DISTINCT order_id)
Data table: orders
Filter conditions:
  - orders.status IN ('paid', 'completed')
Unit: Yuan
Synonyms:
  - Average Ticket Size
  - Average Price per Order

Example 4: Monthly Active Users (Complex Metric)

Metric name: Monthly Active Users
English name: MAU
Description: Number of distinct users with at least one login or purchase in the past 30 days
Calculation method: Custom SQL
SQL definition: |
  SELECT COUNT(DISTINCT user_id) as mau
  FROM (
    SELECT user_id, login_time as action_time
    FROM user_login_logs
    WHERE login_time >= DATE_SUB(NOW(), INTERVAL 30 DAY)

    UNION

    SELECT user_id, created_at as action_time
    FROM orders
    WHERE created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
  ) AS active_users
Synonyms:
  - Monthly Active
  - MAU

2. Define Dimensions

Dimensions are perspectives for data analysis, used for grouping and filtering.

Example 1: Time Dimension

Dimension name: Order Date
Field: orders.created_at
Type: Date time
Supported granularities:
  - Day: DATE(created_at)
  - Week: YEARWEEK(created_at)
  - Month: DATE_FORMAT(created_at, '%Y-%m')
  - Year: YEAR(created_at)

Predefined time ranges:
  - Today: created_at >= CURDATE()
  - Yesterday: DATE(created_at) = DATE_SUB(CURDATE(), INTERVAL 1 DAY)
  - This week: created_at >= DATE_SUB(CURDATE(), INTERVAL WEEKDAY(CURDATE()) DAY)
  - This month: created_at >= DATE_FORMAT(CURDATE(), '%Y-%m-01')
  - Last month: created_at >= DATE_SUB(DATE_FORMAT(CURDATE(), '%Y-%m-01'), INTERVAL 1 MONTH)
         AND created_at < DATE_FORMAT(CURDATE(), '%Y-%m-01')
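As a cross-check of the predefined ranges, the same boundaries can be computed in Python. This is a sketch under one stated difference: each range is returned as an inclusive start and an exclusive end, whereas several of the SQL expressions above leave the upper bound open. The function name `time_ranges` is hypothetical.

```python
from datetime import date, timedelta


def time_ranges(today):
    """Return {name: (start, end)} with inclusive start and exclusive end dates."""
    tomorrow = today + timedelta(days=1)
    month_start = today.replace(day=1)
    # Stepping back one day from the 1st lands in the previous month.
    last_month_start = (month_start - timedelta(days=1)).replace(day=1)
    # date.weekday() is 0 for Monday, the same convention as MySQL's WEEKDAY().
    week_start = today - timedelta(days=today.weekday())
    return {
        "today": (today, tomorrow),
        "yesterday": (today - timedelta(days=1), today),
        "this_week": (week_start, tomorrow),
        "this_month": (month_start, tomorrow),
        "last_month": (last_month_start, month_start),
    }


ranges = time_ranges(date(2026, 3, 3))  # a Tuesday
for name, (start, end) in ranges.items():
    print(f"{name}: {start} <= created_at < {end}")
```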

Example 2: Product Category Dimension

Dimension name: Product Category
Field: products.category
Type: Text
Hierarchy:
  - Level 1 category: category_level1
  - Level 2 category: category_level2
Possible values:
  - Electronics
  - Clothing
  - Food
  - Books

Example 3: User Region Dimension

Dimension name: User Region
Field: customers.region
Type: Text
Hierarchy:
  - Major region: region_level1 (East China, North China, South China, etc.)
  - Province: region_level2 (Beijing, Shanghai, Guangdong, etc.)
  - City: region_level3 (Beijing City, Shanghai City, Guangzhou City, etc.)

3. Define Business Rules

Business rules encapsulate complex business logic.

Example: Valid Order Rule

Rule name: Valid Order
Description: Orders meeting the following conditions are considered valid
Conditions:
  - status IN ('paid', 'completed', 'shipped')
  - amount > 0
  - user_id > 10000  # Exclude test users
  - created_at >= '2024-01-01'  # Only count data from 2024-01-01 onward
SQL snippet: |
  WHERE status IN ('paid', 'completed', 'shipped')
    AND amount > 0
    AND user_id > 10000
    AND created_at >= '2024-01-01'

Application scenarios: All metrics involving order statistics automatically apply this rule to ensure consistent metrics.
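The rule is a conjunction of four conditions, which can be expressed as a plain predicate. The sketch below mirrors the conditions listed above on Python dictionaries; the sample orders and the constant names are made up for illustration.

```python
from datetime import date
from decimal import Decimal

VALID_STATUSES = {"paid", "completed", "shipped"}
TEST_USER_MAX_ID = 10000   # IDs at or below this value are treated as test users
CUTOFF = date(2024, 1, 1)  # only count data from this date onward


def is_valid_order(order):
    """Apply the four 'Valid Order' conditions; all of them must hold."""
    return (
        order["status"] in VALID_STATUSES
        and order["amount"] > 0
        and order["user_id"] > TEST_USER_MAX_ID
        and order["created_at"] >= CUTOFF
    )


sample_orders = [
    {"status": "paid", "amount": Decimal("99.00"), "user_id": 10001, "created_at": date(2024, 5, 1)},
    {"status": "paid", "amount": Decimal("10.00"), "user_id": 42, "created_at": date(2024, 5, 1)},
    {"status": "cancelled", "amount": Decimal("30.00"), "user_id": 10002, "created_at": date(2024, 6, 1)},
    {"status": "shipped", "amount": Decimal("5.00"), "user_id": 20000, "created_at": date(2023, 12, 31)},
]
valid_orders = [o for o in sample_orders if is_valid_order(o)]
print(len(valid_orders))  # only the first order passes every condition
```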

4. Test Semantic Layer Configuration

After configuration, test it:

Question: "What is this month's sales?"

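AI understands:

  • "This month" → created_at >= DATE_FORMAT(CURDATE(), '%Y-%m-01')
  • "Sales" → SUM(amount) with the metric's own status filter; the "Valid Order" rule is then applied automatically

Generated SQL (the metric filter and the rule are combined with AND, so only 'paid' and 'completed' orders remain):

SELECT SUM(amount) as gmv
FROM orders
WHERE status IN ('paid', 'completed')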
  AND amount > 0
  AND user_id > 10000
  AND created_at >= DATE_FORMAT(CURDATE(), '%Y-%m-01')

Returned result:

This month's sales: ¥1,234,567

Step 3: Start Natural Language Queries (15 minutes)

After configuring the semantic layer, you can start querying data using natural language.

Basic Queries

Question 1: "What is yesterday's order volume?"

AI generates:

SELECT COUNT(*) as order_count
FROM orders
WHERE DATE(created_at) = DATE_SUB(CURDATE(), INTERVAL 1 DAY)
  AND status != 'cancelled'

Question 2: "Daily sales this week"

AI generates:

SELECT
  DATE(created_at) as date,
  SUM(amount) as gmv
FROM orders
WHERE created_at >= DATE_SUB(CURDATE(), INTERVAL WEEKDAY(CURDATE()) DAY)
  AND status IN ('paid', 'completed')
GROUP BY DATE(created_at)
ORDER BY date

Returned result: Automatically generates a line chart showing trends.

Comparison Queries

Question 3: "Compare this month's sales with last month"

AI generates:

SELECT
  CASE
    WHEN created_at >= DATE_FORMAT(CURDATE(), '%Y-%m-01') THEN 'This month'
    ELSE 'Last month'
  END as period,
  SUM(amount) as gmv
FROM orders
WHERE created_at >= DATE_SUB(DATE_FORMAT(CURDATE(), '%Y-%m-01'), INTERVAL 1 MONTH)
  AND status IN ('paid', 'completed')
GROUP BY period

Returned result:

This month: ¥1,234,567
Last month: ¥1,100,000
Growth rate: +12.2%
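The growth rate in this result is the change between the two periods divided by the earlier period. A short sketch of that arithmetic, using the figures shown above (the function name is hypothetical):

```python
def growth_rate(current, previous):
    """Percentage change from `previous` to `current`."""
    if previous == 0:
        raise ValueError("growth rate is undefined when the previous value is 0")
    return (current - previous) / previous * 100


rate = growth_rate(1_234_567, 1_100_000)
print(f"{rate:+.1f}%")  # prints +12.2%
```

Keep in mind that "this month" is usually a partial month, so a comparison with the full previous month understates growth early in the month.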

Grouping Queries

Question 4: "Sales proportion by product category"

AI generates:

SELECT
  p.category,
  SUM(o.amount) as gmv,
  SUM(o.amount) / (SELECT SUM(amount) FROM orders WHERE status IN ('paid', 'completed')) * 100 as percentage
FROM orders o
JOIN products p ON o.product_id = p.product_id
WHERE o.status IN ('paid', 'completed')
GROUP BY p.category
ORDER BY gmv DESC

Returned result: Automatically generates a pie chart showing proportion by category.
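The proportion query can be verified on a few sample rows. The sketch below runs essentially the same statement in SQLite rather than MySQL (an assumption made so that it runs without a server), with made-up data in which one cancelled order must be ignored by both the numerator and the denominator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE orders (order_id INTEGER, product_id INTEGER, amount REAL, status TEXT);
    INSERT INTO products VALUES (1, 'Electronics'), (2, 'Clothing');
    INSERT INTO orders VALUES
        (1, 1, 200.0, 'paid'), (2, 1, 100.0, 'completed'),
        (3, 2, 100.0, 'paid'), (4, 2, 999.0, 'cancelled');
""")

shares = conn.execute("""
    SELECT p.category,
           SUM(o.amount) AS gmv,
           SUM(o.amount) * 100.0
             / (SELECT SUM(amount) FROM orders
                WHERE status IN ('paid', 'completed')) AS percentage
    FROM orders o
    JOIN products p ON o.product_id = p.product_id
    WHERE o.status IN ('paid', 'completed')
    GROUP BY p.category
    ORDER BY gmv DESC
""").fetchall()

for category, gmv, percentage in shares:
    print(f"{category}: {gmv:.2f} ({percentage:.1f}%)")
```

The status filter appears twice on purpose: if the subquery omitted it, the cancelled order would inflate the denominator and the percentages would no longer add up to 100.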

Ranking Queries

Question 5: "Top 10 products by sales"

AI generates:

SELECT
  p.product_name,
  SUM(o.amount) as gmv
FROM orders o
JOIN products p ON o.product_id = p.product_id
WHERE o.status IN ('paid', 'completed')
GROUP BY p.product_id, p.product_name
ORDER BY gmv DESC
LIMIT 10

Multi-turn Conversation

Round 1: "Sales by region this month"
Returns: East China ¥500K, North China ¥400K, South China ¥300K...

Round 2: "Only East China"
AI understands: Continue from the previous query and add the filter region = 'East China'

Round 3: "Group by city"
AI understands: Within East China, group by city

This multi-turn conversation capability makes data exploration smoother.
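One way to picture this is as a query state that each round modifies instead of starting over. The sketch below is a simplified model, not a description of AskTable's internals; the `QueryState` class and the helper functions are hypothetical.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class QueryState:
    """What the conversation has established so far."""
    metric: str
    time_range: str
    group_by: str
    filters: dict = field(default_factory=dict)


def add_filter(state, column, value):
    """Follow-up such as 'Only East China': keep everything, add one filter."""
    return replace(state, filters={**state.filters, column: value})


def regroup(state, column):
    """Follow-up such as 'Group by city': keep the filters, change the grouping."""
    return replace(state, group_by=column)


round1 = QueryState(metric="sales", time_range="this_month", group_by="region")
round2 = add_filter(round1, "region", "East China")
round3 = regroup(round2, "city")
print(round3)
```

Because each round returns a new state, earlier rounds stay intact and the user can go back to any of them.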

Step 4: Set Up Permission Control (10 minutes)

Data security is crucial, especially when involving customer information and financial data.

1. Create User Roles

Go to the "Permission Management" page and create different roles:

Role 1: Sales Personnel

Role name: Sales Personnel
Permission scope:
  Accessible data sources: Sales Database
  Accessible tables:
    - orders (can only see orders in their own region)
    - customers (can only see customers in their own region)
    - products (all visible)
  Row-level permissions:
    - orders: region = :user_region
    - customers: region = :user_region
  Column-level permissions:
    - customers.phone: Masked display (138****5678)
    - customers.id_card: Not visible

Role 2: Operations Personnel

Role name: Operations Personnel
Permission scope:
  Accessible data sources: Sales Database
  Accessible tables: All visible
  Row-level permissions: No restrictions
  Column-level permissions:
    - customers.phone: Masked display
    - customers.id_card: Masked display
  Prohibited operations:
    - Batch export customer data (single export limited to 100 records)

Role 3: Management

Role name: Management
Permission scope: All permissions
Row-level permissions: No restrictions
Column-level permissions: All visible
Allowed operations: All

2. Add Users and Assign Roles

  1. Click "Add User"
  2. Fill in user information:
    Name: Zhang San
    Email: zhangsan@company.com
    Role: Sales Personnel
    Custom attributes:
      region: East China  # Used for row-level permission filtering
    
  3. After saving, the system automatically sends an invitation email

3. Test Permissions

Log in with "Zhang San's" account and ask: "This month's order volume"

Automatically applied permission filtering:

SELECT COUNT(*) as order_count
FROM orders
WHERE created_at >= DATE_FORMAT(CURDATE(), '%Y-%m-01')
  AND region = 'East China'  # Automatically added permission filter

Zhang San can only see East China region data and cannot see national data.
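The mechanism can be sketched as a filter template whose placeholder is bound to the user's attribute at query time. The example uses SQLite's named parameters, which happen to share the `:user_region` notation. It assumes that the orders table carries a region column, which the table structure shown in Step 1 does not list, and the `ROW_FILTERS` mapping and `count_orders` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "East China", 100.0), (2, "North China", 80.0), (3, "East China", 40.0)],
)

# Row-level rule per table; the placeholder is filled from the user's attributes.
ROW_FILTERS = {"orders": "region = :user_region"}


def count_orders(conn, user_attributes):
    """Count the orders that the given user is allowed to see."""
    sql = "SELECT COUNT(*) FROM orders WHERE " + ROW_FILTERS["orders"]
    # Binding the value as a parameter avoids pasting user data into the SQL text.
    return conn.execute(sql, user_attributes).fetchone()[0]


zhang_san = {"user_region": "East China"}
visible = count_orders(conn, zhang_san)
print(visible)  # 2 of the 3 orders
```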

4. Data Masking Configuration

For sensitive fields, configure masking rules:

Phone number masking:

Field: customers.phone
Masking rule: CONCAT(LEFT(phone, 3), '****', RIGHT(phone, 4))
Example: 13812345678 → 138****5678

ID card masking:

Field: customers.id_card
Masking rule: CONCAT(LEFT(id_card, 6), '********', RIGHT(id_card, 4))
Example: 110101199001011234 → 110101********1234

Address masking:

Field: customers.address
Masking rule: CONCAT(SUBSTRING(address, 1, 10), '***')
Example: Beijing Chaoyang District Jianguomenwai Street No. 1 → Beijing Ch*** (the rule keeps only the first 10 characters)
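The three masking rules translate directly into string slicing. The sketch below mirrors the SQL expressions above in Python so that the rules can be tested on sample values before they are configured; the function names are hypothetical.

```python
def mask_phone(phone):
    """Keep the first 3 and last 4 digits, like CONCAT(LEFT(phone, 3), '****', RIGHT(phone, 4))."""
    return phone[:3] + "****" + phone[-4:]


def mask_id_card(id_card):
    """Keep the first 6 and last 4 characters, hiding the 8 in between for an 18-character ID."""
    return id_card[:6] + "********" + id_card[-4:]


def mask_address(address, keep=10):
    """Keep the first `keep` characters, like SUBSTRING(address, 1, 10)."""
    return address[:keep] + "***"


print(mask_phone("13812345678"))
print(mask_id_card("110101199001011234"))
print(mask_address("Beijing Chaoyang District"))
```

Because the address rule counts characters, how much of an address it reveals depends on the script and language in which the address is written.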

Step 5: Create Reusable Templates (5 minutes)

For commonly used queries, you can create templates for quick access.

1. Save Query as Template

Ask: "Sales and order volume by product category this month"

After successful query, click "Save as Template":

Template name: Monthly Product Category Analysis
Description: Statistics of sales and order volume by product category this month
Parameters:
  - Time range: Optional (default this month)
  - Product category: Optional (default all)
Sharing scope: Visible to entire company

2. Use Template

Other users can find this template in the "Template Library":

  1. Click "Monthly Product Category Analysis"
  2. (Optional) Adjust parameters, such as selecting "Last month"
  3. Click "Run" to immediately get results

3. Scheduled Reports

Set up scheduled sending:

Report name: Daily Sales Brief
Query template: Monthly Product Category Analysis
Sending frequency: Daily at 9:00 AM
Recipients:
  - zhangsan@company.com
  - lisi@company.com
Sending method: Email
Format: PDF + Excel
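A "daily at 9:00 AM" schedule comes down to finding the next 9:00 after the current moment. The sketch below shows that calculation for anyone reproducing the schedule in their own scripts; it ignores time zones, and the function name `next_run` is hypothetical.

```python
from datetime import datetime, time, timedelta


def next_run(now, at=time(9, 0)):
    """Return the next datetime strictly after `now` that falls on the daily time `at`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's slot has already passed (or is exactly now), so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


print(next_run(datetime(2026, 3, 3, 8, 30)))
print(next_run(datetime(2026, 3, 3, 9, 0)))
```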

Common Issues and Solutions

Q1: AI-generated SQL is inaccurate, what should I do?

Cause: The semantic layer configuration is incomplete, so the AI misinterprets the question.

Solution:

  1. Check if metric definitions are clear
  2. Add synonyms to help AI understand different expressions
  3. If a particular query is often wrong, create "Example Queries" to teach the AI the correct interpretation

Example:

Example Queries:
  Question: "New users this month"
  Correct SQL: |
    SELECT COUNT(DISTINCT user_id)
    FROM users
    WHERE DATE_FORMAT(created_at, '%Y-%m') = DATE_FORMAT(CURDATE(), '%Y-%m')
  Note: "New users" refers to users whose registration time is this month, not active users

Q2: Query is slow, what should I do?

Causes:

  • Large data volume
  • Missing indexes
  • Complex SQL

Solutions:

  1. Create indexes on frequently queried fields in the database
  2. For complex queries, consider building pre-aggregated tables
  3. Use AskTable's caching functionality
  4. Limit the number of query results (e.g., return at most 10,000 rows)
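The effect of the first suggestion can be observed in a query plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN as a stand-in (MySQL has its own EXPLAIN with a different output format) and a hypothetical index name; the exact wording of the plan text varies between SQLite versions, so the example only checks whether the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, created_at TEXT, amount REAL)")


def query_plan(conn, sql):
    """Return SQLite's plan description for `sql` as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


QUERY = "SELECT SUM(amount) FROM orders WHERE created_at >= '2026-03-01'"

plan_before = query_plan(conn, QUERY)   # full table scan
conn.execute("CREATE INDEX idx_orders_created_at ON orders (created_at)")
plan_after = query_plan(conn, QUERY)    # range search on the index

print("before:", plan_before)
print("after: ", plan_after)
```

The filter compares the bare column with a constant. Wrapping the column in a function, as in DATE_FORMAT(created_at, '%Y-%m') = ..., generally prevents the database from using such an index.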

Q3: How to handle data update delays?

Problem: Data in the database updates in real time, but AskTable query results lag behind.

Solutions:

  • Real-time queries: Each query directly accesses the database (default method)
  • Scheduled refresh: For large data volume scenarios, you can set scheduled metadata refresh (e.g., once per hour)
  • Manual refresh: After important data updates, manually trigger metadata synchronization

Q4: How do I query across multiple data sources?

Problem: Order data is in MySQL and user behavior data is in ClickHouse. How can they be joined?

Solutions:

  • Option 1: Configure cross-data-source joins in AskTable (requires Enterprise edition)
  • Option 2: Synchronize the data at the database level (e.g., using ETL tools)
  • Option 3: Unify the data in a data warehouse

Advanced Tips

1. Custom Charts

AskTable automatically selects chart types by default, but you can customize:

Query: "Sales by region"
Chart type: Map
Configuration:
  Geographic field: region
  Numeric field: gmv
  Color scheme: Blue gradient

2. Export and Share

Export:

  • Export as Excel (includes raw data)
  • Export as PDF (includes charts)
  • Export as image (charts only)

Share:

  • Generate share links (set validity period and access password)
  • Embed in web pages or internal systems
  • Scheduled email sending

3. API Integration

If you have development resources, you can integrate AskTable through its API:

import requests

# Send a natural-language question to the query endpoint
response = requests.post(
    'https://api.asktable.com/v1/query',
    headers={'Authorization': 'Bearer YOUR_API_KEY'},
    json={
        'question': "This month's sales",
        'datasource_id': 'your_datasource_id'
    },
    timeout=30,  # avoid waiting indefinitely if the service is slow
)
response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body

result = response.json()
print(f"Sales: {result['data']['gmv']}")

Application scenarios:

  • Embed data query functionality in internal systems
  • Automated report generation
  • Data monitoring and alerting

Summary

Congratulations! You have completed building an enterprise-level AI data analysis system.

Review:

  • ✅ Connect data sources (10 minutes)
  • ✅ Configure business semantic layer (20 minutes)
  • ✅ Natural language queries (15 minutes)
  • ✅ Permission control (10 minutes)
  • ✅ Create templates (5 minutes)

Total time: 60 minutes

Next steps:

  1. Gradually add more metrics and dimensions
  2. Invite team members to use it
  3. Collect feedback and optimize semantic layer configuration
  4. Explore more advanced features

Core value:

  • Efficiency improvement: The "submit requirement → wait → get data" cycle shrinks from 3 days to 30 seconds
  • Lowered barriers: Business personnel can query without learning SQL, using natural language
  • Unified metrics: Business semantic layer ensures the entire company uses the same indicator definitions
  • Data security: Refined permission control, sensitive data automatically masked

Actual effects (from real customers):

"We are a 50-person e-commerce team. Previously we had 10+ temporary data needs every day, and the data team was exhausted. After introducing AskTable, 70% of needs were completed by business personnel independently, and the data team was freed from 'data fetching' work to focus on higher-value analysis." — CTO of an e-commerce company

Start your data-driven journey:

  • Visit the AskTable official website for a free trial
  • Join the user community to exchange usage experiences
  • Watch video tutorials to learn more tips

Let data analysis return to its essence: Simple, fast, usable by everyone.

