AST-Based SQL Permission Control System - SQLGlot Scope Practice

In a multi-tenant data analysis system, SQL permission control is the core mechanism ensuring data security. Traditional permission control often relies on database-level views or row-level security policies, but this approach has problems like insufficient flexibility and difficulty in dynamic adjustment. AskTable adopts an innovative approach: AST-based SQL permission control system (SQL Guard), which dynamically injects permission conditions before SQL execution through SQLGlot library's Scope analysis capability.

Core Design Philosophy

SQL Guard's design follows the "Clean Architecture" principle, completely separating permission control logic from business logic. Its core idea is:

•Rules as Code: Permission rules defined as SQL expressions, supporting variable templates
•AST-Level Injection: Inject permission conditions at the syntax tree level, ensuring they cannot be bypassed
•Scope Recursive Processing: Handle subqueries, CTEs, UNION and other complex structures
•Wildcard Support: Support * wildcard matching for arbitrary schema or table

Permission Rule Definition

Permission rules use a SQL-like expression syntax, supporting Jinja2 variable templates:

# Example rules
rules = [
    "users.orders.user_id = {{ current_user_id }}",  # Users can only view their own orders
    "*.*.region IN ('CN', 'US')",                     # Restrict region scope
    "sales.*.created_at >= '2024-01-01'",            # Time range restriction
]

Technical Architecture

加载图表中...

Core Implementation Process

1. Rule Parsing and Matching

The rule parser first converts rule strings to AST and identifies schema, table, and field information:

def parse_rule(rule: str, variables: dict, sql: str, dialect: str) -> list[Rule]:
    # 1. Render Jinja2 variables
    template = Template(rule)
    rule_str = template.render(**variables)

    # 2. Replace wildcard * with ALL
    rule_str = rule_str.replace("*", "ALL")

    # 3. Parse rule and SQL to AST
    rule_ast = sqlglot.parse_one(rule_str, read=dialect)
    sql_ast = sqlglot.parse_one(sql, read=dialect)

    # 4. Extract column information from rule
    rule_column = extract_column_from_rule(rule_ast)

    # 5. Traverse all tables in SQL, match rules
    rules = []
    for table in sql_ast.find_all(exp.Table):
        if matches(rule_column, table):
            qualified_ast = qualify_rule_ast(rule_ast, table)
            rules.append(Rule(
                schema=table.db,
                table=table.name,
                field=rule_column.name,
                ast=qualified_ast
            ))

    return rules

2. Scope Recursive Processing

SQLGlot's Scope provides powerful scope analysis capability, accurately identifying table aliases, subqueries, CTEs, and other complex structures:

def _process_scope(self, scope: Scope, rules: list[Rule]):
    # 1. Build permission conditions for current scope
    conditions = self._build_conditions_for_scope(scope, rules)
    if conditions:
        self._inject_conditions(scope, conditions)

    # 2. Recursively process all sub-scopes
    for subscope in self._get_all_subscopes(scope):
        self._process_scope(subscope, rules)

def _get_all_subscopes(self, scope: Scope) -> list[Scope]:
    return (
        scope.derived_table_scopes +  # FROM subqueries
        scope.cte_scopes +             # WITH CTE
        scope.subquery_scopes +        # WHERE/SELECT subqueries
        scope.union_scopes +           # UNION
        scope.udtf_scopes              # Table functions
    )

3. Condition Injection

Condition injection needs to handle table aliases, quotes, and merging with existing WHERE clauses:

def _inject_conditions(self, scope: Scope, conditions: list[exp.Expression]):
    # 1. Find SELECT statement
    select = scope.expression if isinstance(scope.expression, exp.Select) \
             else scope.expression.find(exp.Select)

    # 2. Merge all conditions (deduplicate)
    combined = self._combine_conditions(conditions)

    # 3. Inject into WHERE clause
    where = select.find(exp.Where)
    if where:
        # Connect with existing conditions using AND
        where.set("this", exp.And(this=where.this, expression=combined))
    else:
        # Create new WHERE clause
        select.set("where", exp.Where(this=combined))

Practical Application Examples

Example 1: Multi-Tenant Data Isolation

# Original SQL
sql = """
SELECT o.id, o.amount, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'completed'
"""

# Permission rules
rules = ["*.orders.tenant_id = '{{ tenant_id }}'"]
variables = {"tenant_id": "tenant_123"}

# SQL after applying permissions
protected_sql = guard(sql, "postgres", rules, variables)

Generated SQL:

SELECT o.id, o.amount, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'completed'
  AND o.tenant_id = 'tenant_123'  -- Auto injected

Example 2: Complex Subquery Processing

sql = """
SELECT *
FROM (
    SELECT user_id, SUM(amount) as total
    FROM orders
    GROUP BY user_id
) subq
WHERE total > 1000
"""

rules = ["*.orders.region = 'CN'"]
protected_sql = guard(sql, "mysql", rules)

Generated SQL:

SELECT *
FROM (
    SELECT user_id, SUM(amount) as total
    FROM orders
    WHERE region = 'CN'  -- Injected inside subquery
    GROUP BY user_id
) subq
WHERE total > 1000

Example 3: CTE and UNION Processing

sql = """
WITH recent_orders AS (
    SELECT * FROM orders WHERE created_at > '2024-01-01'
)
SELECT * FROM recent_orders
UNION ALL
SELECT * FROM orders WHERE status = 'pending'
"""

rules = ["*.orders.user_id = {{ user_id }}"]
variables = {"user_id": "123"}
protected_sql = guard(sql, "postgres", rules, variables)

Generated SQL:

WITH recent_orders AS (
    SELECT * FROM orders
    WHERE created_at > '2024-01-01'
      AND user_id = '123'  -- Injected inside CTE
)
SELECT * FROM recent_orders
UNION ALL
SELECT * FROM orders
WHERE status = 'pending'
  AND user_id = '123'  -- Injected into UNION branch

Technical Advantages

1. Security Guarantee

•Unbypassable: Injection at AST level, any SQL statement is processed
•Recursive Coverage: Automatically handle all subqueries, CTEs, UNION structures
•Type Safe: Based on AST operations, avoiding string concatenation injection risks

2. Flexibility

•Dynamic Rules: Support Jinja2 variable templates, dynamically generate based on user context
•Wildcard Support: * can match any schema or table
•Multi-Rule Combination: Support multiple rules taking effect simultaneously, auto-deduplicate

3. Maintainability

•Clean Architecture: Clear responsibility separation
•Easy Testing: Each component can be tested independently
•Extensible: Support custom operators and rule types

Performance Considerations

SQL Guard's performance overhead mainly comes from AST parsing and Scope building:

•Parsing Overhead: SQLGlot parsing speed approximately 1000-5000 SQL/sec
•Scope Building: Complex SQL (multi-layer nesting) approximately 100-500 times/sec
•
Optimization Strategies:
- •Rule caching: Same rule set can reuse parsing results
- •Lazy injection: Apply permissions only when necessary
- •Batch processing: Multiple SQL can share Scope building

Summary

The AST-based SQL permission control system is a core component of AskTable's security architecture. Through SQLGlot's powerful capabilities, we achieve:

•Fine-Grained Control: Field-level permission management
•Dynamic and Flexible: Support variable templates and wildcards
•Structural Integrity: Automatically handle all SQL structures
•Secure and Reliable: AST-level injection, cannot be bypassed

This design not only ensures data security but also provides elegant solutions for complex scenarios like multi-tenancy and row-level security.

cta.readyToSimplify

sidebar.noProgrammingNeeded
sidebar.startFreeTrial

cta.startFree cta.viewPricing

cta.noCreditCard

cta.quickStart

cta.dbSupport

sidebar.joinAskTableCommunity

AST-Based SQL Permission Control System - SQLGlot Scope Practice

AST-Based SQL Permission Control System - SQLGlot Scope Practice

Core Design Philosophy

Permission Rule Definition

Technical Architecture

Core Implementation Process

1. Rule Parsing and Matching

2. Scope Recursive Processing

3. Condition Injection

Practical Application Examples

Example 1: Multi-Tenant Data Isolation

Example 2: Complex Subquery Processing

Example 3: CTE and UNION Processing

Technical Advantages

1. Security Guarantee

2. Flexibility

3. Maintainability

Performance Considerations

Summary

Related Resources

cta.readyToSimplify