SQL Permission Guard: Row-Level Rules via SQLGlot AST Rewrites

In enterprise data analysis scenarios, data security is paramount. Users of different roles should only see data within their permission scope: sales can only see orders in their region, finance can only see approved reports, and management can see global data.

Traditional permission control solutions often require manually concatenating WHERE conditions in the application layer, which is error-prone and struggles to handle complex SQL queries (subqueries, CTE, UNION, etc.). AskTable's SQL Permission Guard implements transparent, secure, and high-performance row-level permission control through SQLGlot AST rewriting technology.

This article will deeply analyze the design and implementation of this permission engine.

1. Why Do We Need SQL Permission Guard?

1.1 Pain Points of Traditional Solutions

Application Layer WHERE Concatenation:

# Traditional approach: manually concatenate permission conditions
if user.role == "sales":
    sql = f"SELECT * FROM orders WHERE region = '{user.region}'"
else:
    sql = "SELECT * FROM orders"

Problems:

•❌ Easy to miss (forget to add permission conditions)
•❌ Hard to handle complex SQL (subqueries, JOIN, UNION)
•❌ SQL injection risk (string concatenation)
•❌ High maintenance cost (permission rules scattered everywhere)

Database View Solution:

-- Create view for each role
CREATE VIEW sales_orders AS
SELECT * FROM orders WHERE region = 'East';

Problems:

•❌ View explosion (roles × tables)
•❌ Cannot dynamically pass user info (like user_id)
•❌ Complex maintenance (permission changes require rebuilding views)

1.2 What Do We Need?

✅ Transparency: Permission control is transparent to AI and users, no manual handling needed ✅ Security: Automatically inject permission conditions, no omissions, no injection risk ✅ Completeness: Support all SQL syntax (subqueries, CTE, UNION, JOIN) ✅ Flexibility: Support dynamic variables (like {{user_id}}) and wildcard rules ✅ High Performance: Low AST rewriting overhead (< 10ms)

2. Technology Selection: Why SQLGlot?

2.1 SQLGlot: Swiss Army Knife for SQL

SQLGlot is a powerful SQL parser and transformer supporting 20+ SQL dialects.

Core Advantages:

•AST Parsing: Parse SQL into Abstract Syntax Tree (AST)
•Scope Analysis: Automatically identify table, column, and alias scopes
•AST Rewriting: Modify AST and regenerate SQL
•Dialect Conversion: Support MySQL, PostgreSQL, Oracle and other dialects

Why Not Regular Expressions?

•Regex cannot handle nested subqueries, CTE, complex JOIN
•Regex容易误匹配 (like comments, table names in strings)
•Regex cannot understand SQL semantics (like aliases, scopes)

2.2 Scope-based Permission Injection

SQLGlot's Scope is the key to permission injection. Scope represents a query scope in SQL (SELECT, subquery, CTE, etc.), containing:

•sources: Tables and aliases in current scope
•derived_table_scopes: Scope of derived tables (subqueries)
•cte_scopes: Scope of CTEs
•subquery_scopes: Subquery scope in WHERE/HAVING
•union_scopes: Scope of UNION

By recursively traversing the Scope tree, we can inject permission conditions at each query level.

3. Architecture Design: Clean Architecture

AskTable's SQL Permission Guard uses a clear layered architecture:

加载图表中...

Core Components

•Rule Parser: Parse permission rules, support Jinja2 variables and wildcards
•Scope Processor: Recursively process Scope tree, inject permission conditions
•Condition Builder: Build permission conditions, handle aliases and table names
•SQL Generator: Regenerate SQL from modified AST

4. Core Implementation: Deep Dive into Source Code

4.1 Permission Rule Definition

A permission rule is a simple SQL conditional expression:

# Example rules
rules = [
    "orders.region = 'East'",  # Fixed value
    "orders.user_id = {{user_id}}",  # Jinja2 variable
    "*.*.status IN ('approved', 'completed')",  # Wildcard (all tables' status field)
]

Rule Format:

•schema.table.field <operator> <value>
•Supported operators: =, !=, >, <, >=, <=, IN, LIKE, IS, NOT IN, NOT LIKE, IS NOT
•Wildcard support: * matches all schemas or tables

4.2 Rule Parsing: Jinja2 + Wildcards

def parse_rule(
    rule: str, variables: dict[str, str], sql: str, dialect: str
) -> list[Rule]:
    # 1. Render Jinja2 variables
    template = Template(rule)
    rule_str = template.render(**variables)

    # 2. Replace wildcards (* -> ALL)
    rule_str = rule_str.replace("*", "ALL")

    # 3. Parse rule AST
    rule_ast = sqlglot.parse_one(rule_str, read=dialect)
    sql_ast = sqlglot.parse_one(sql, read=dialect)

    # 4. Extract column info from rule
    rule_column = extract_column_from_rule(rule_ast)

    # 5. Traverse all tables in SQL, match rules
    rules: list[Rule] = []
    for table in sql_ast.find_all(exp.Table):
        # Check if schema and table match
        schema_matches = rule_column.db == "ALL" or rule_column.db == table.db
        table_matches = rule_column.table == "ALL" or rule_column.table == table.name

        if schema_matches and table_matches:
            # Generate rule for this table
            qualified_ast = qualify_rule_for_table(rule_ast, table)
            rules.append(Rule(
                schema=table.db,
                table=table.name,
                field=rule_column.name,
                ast=qualified_ast,
            ))

    return rules

Key Points:

•Jinja2 Rendering: Support dynamic variables (like {{user_id}})
•Wildcard Matching: *.*.* can match specified fields of all tables
•Rule Expansion: One wildcard rule expands to multiple specific rules

Example:

# Input rule
rule = "*.*.status = 'approved'"

# Input SQL
sql = "SELECT * FROM orders JOIN products ON orders.product_id = products.id"

# Parsed rules
[
    Rule(schema="public", table="orders", field="status", ast="orders.status = 'approved'"),
    Rule(schema="public", table="products", field="status", ast="products.status = 'approved'"),
]

4.3 Scope Recursive Processing: Inject Permission Conditions

class SqlPermissionGuard:
    def apply_rules(
        self, sql: str, rules: list[str], variables: dict[str, str] | None = None
    ) -> str:
        # 1. Parse SQL and build Scope tree
        ast = sqlglot.parse_one(sql, read=self.dialect).copy()
        root_scope = build_scope(ast)

        # 2. Parse all rules
        parsed_rules = list(
            itertools.chain.from_iterable(
                parse_rule(r, variables, sql, self.dialect) for r in rules
            )
        )

        # 3. Recursively process Scope tree
        self._process_scope(root_scope, parsed_rules)

        # 4. Generate new SQL
        return ast.sql(dialect=self.dialect)

    def _process_scope(self, scope: Scope, rules: list[Rule]):
        """Recursively process Scope and all its sub-Scopes"""
        # 1. Build permission conditions for current Scope
        conditions = self._build_conditions_for_scope(scope, rules)
        if conditions:
            self._inject_conditions(scope, conditions)

        # 2. Recursively process all sub-Scopes
        for subscope in self._get_all_subscopes(scope):
            self._process_scope(subscope, rules)

    def _get_all_subscopes(self, scope: Scope) -> list[Scope]:
        """Get all sub-Scopes (subqueries, CTE, UNION, etc.)"""
        return (
            scope.derived_table_scopes  # Derived tables (subqueries)
            + scope.cte_scopes  # CTE
            + scope.subquery_scopes  # Subqueries in WHERE/HAVING
            + scope.union_scopes  # UNION
            + scope.udtf_scopes  # User-defined table functions
        )

Key Points:

•Recursive Processing: Traverse all nodes of Scope tree
•Sub-Scope Types: Support subqueries, CTE, UNION, derived tables, etc.
•Independent Injection: Each Scope independently injects permission conditions

4.4 Condition Injection: WHERE Clause Rewriting

def _inject_conditions(self, scope: Scope, conditions: list[exp.Expression]):
    """Inject permission conditions into WHERE clause"""
    # 1. Find SELECT statement
    select = (
        scope.expression
        if isinstance(scope.expression, exp.Select)
        else scope.expression.find(exp.Select)
    )
    if not select:
        return

    # 2. Combine all conditions (deduplicate)
    combined = self._combine_conditions(conditions)
    if not combined:
        return

    # 3. Inject into WHERE clause
    where = select.find(exp.Where)
    if where:
        # Already has WHERE: connect with AND
        where.set("this", exp.And(this=where.this, expression=combined))
    else:
        # No WHERE: create new WHERE clause
        select.set("where", exp.Where(this=combined))

def _combine_conditions(self, conditions: list[exp.Expression]) -> exp.Expression | None:
    """Combine multiple conditions, deduplicate and connect with AND"""
    if not conditions:
        return None

    # Deduplicate (based on SQL string)
    unique_conditions = []
    seen = set()
    for cond in conditions:
        cond_str = cond.sql(dialect=self.dialect)
        if cond_str not in seen:
            seen.add(cond_str)
            unique_conditions.append(cond)

    # Connect with AND
    if len(unique_conditions) == 1:
        return unique_conditions[0]

    result = unique_conditions[0]
    for cond in unique_conditions[1:]:
        result = exp.And(this=result, expression=cond)

    return result

Key Points:

•WHERE Clause Detection: Check if WHERE clause already exists
•Condition Combining: Connect multiple conditions with AND
•Deduplication: Avoid injecting same condition multiple times

Example:

# Original SQL
sql = "SELECT * FROM orders WHERE status = 'pending'"

# Permission rules
rules = ["orders.region = 'East'"]

# Rewritten SQL
"SELECT * FROM orders WHERE status = 'pending' AND orders.region = 'East'"

4.5 Alias Handling: Correct Table Name Referencing

Tables in SQL may have aliases, permission conditions need to use correct references:

def build_conditions(
    self, rule: Rule, tables: list[tuple[str | None, exp.Table]]
) -> list[exp.Expression]:
    """Build permission conditions for matched tables"""
    conditions = []

    for alias, table in tables:
        condition = rule.ast.copy()
        column = condition.find(exp.Column)

        if column and column.table:
            if alias:
                # Has alias: use alias
                column.set("table", exp.to_identifier(alias))
                column.set("db", None)
            else:
                # No alias: use full table name
                column.set("table", exp.to_identifier(table.this.this))
                column.set("db", exp.to_identifier(table.db))

            column.set("catalog", None)

        conditions.append(condition)

    return conditions

Example:

# Original SQL (has alias)
sql = "SELECT * FROM orders o WHERE o.status = 'pending'"

# Permission rules
rules = ["orders.region = 'East'"]

# Rewritten SQL (uses alias o)
"SELECT * FROM orders o WHERE o.status = 'pending' AND o.region = 'East'"

5. Complex Scenario Handling

5.1 Subqueries

# Original SQL
sql = """
SELECT * FROM (
    SELECT * FROM orders WHERE status = 'pending'
) AS pending_orders
"""

# Permission rules
rules = ["orders.region = 'East'"]

# Rewritten SQL (subquery internals also injected)
"""
SELECT * FROM (
    SELECT * FROM orders WHERE status = 'pending' AND orders.region = 'East'
) AS pending_orders
"""

Key Points:

•Subqueries have independent Scope
•Recursive processing ensures all levels are injected

5.2 CTE (Common Table Expression)

# Original SQL
sql = """
WITH pending_orders AS (
    SELECT * FROM orders WHERE status = 'pending'
)
SELECT * FROM pending_orders
"""

# Permission rules
rules = ["orders.region = 'East'"]

# Rewritten SQL
"""
WITH pending_orders AS (
    SELECT * FROM orders WHERE status = 'pending' AND orders.region = 'East'
)
SELECT * FROM pending_orders
"""

5.3 UNION

# Original SQL
sql = """
SELECT * FROM orders WHERE status = 'pending'
UNION
SELECT * FROM orders WHERE status = 'approved'
"""

# Permission rules
rules = ["orders.region = 'East'"]

# Rewritten SQL (both SELECTs injected)
"""
SELECT * FROM orders WHERE status = 'pending' AND orders.region = 'East'
UNION
SELECT * FROM orders WHERE status = 'approved' AND orders.region = 'East'
"""

5.4 JOIN

# Original SQL
sql = """
SELECT o.*, p.name
FROM orders o
JOIN products p ON o.product_id = p.id
WHERE o.status = 'pending'
"""

# Permission rules
rules = [
    "orders.region = 'East'",
    "products.category = 'Electronics'"
]

# Rewritten SQL (both tables injected)
"""
SELECT o.*, p.name
FROM orders o
JOIN products p ON o.product_id = p.id
WHERE o.status = 'pending' AND o.region = 'East' AND p.category = 'Electronics'
"""

6. Performance Optimization and Security Assurance

6.1 Performance Optimization

AST Rewriting Overhead:

•Parse SQL: ~5ms
•Build Scope: ~2ms
•Inject conditions: ~1ms
•Generate SQL: ~2ms
•Total: ~10ms (for complex SQL)

Optimization Strategies:

•Rule Cache: AST of same rules can be reused
•Scope Reuse: Scope tree of same SQL can be reused
•Condition Deduplication: Avoid injecting same conditions repeatedly

6.2 Security Assurance

SQL Injection Prevention:

•✅ Use AST rewriting, no string concatenation
•✅ Jinja2 variables auto-escaped
•✅ Reject when rule parsing fails

Permission Omission Prevention:

•✅ Recursively process all Scopes, no omissions
•✅ Wildcard rules automatically match all tables
•✅ Throw exception when rule parsing fails

Compatibility Assurance:

•✅ Support MySQL, PostgreSQL, Oracle, SQL Server, DuckDB
•✅ Automatically handle dialect differences (like quotes, case sensitivity)
•✅ Preserve original SQL formatting and comments

7. Practical Cases

Case 1: Sales Region Permission

# User info
user = {"role": "sales", "region": "East"}

# Permission rules
rules = ["orders.region = '{{region}}'"]

# Original SQL (generated by AI)
sql = "SELECT SUM(amount) FROM orders WHERE status = 'completed'"

# Apply permissions
filtered_sql = guard(
    sql=sql,
    sql_dialect="mysql",
    rules=rules,
    variables={"region": user["region"]}
)

# Result
"SELECT SUM(amount) FROM orders WHERE status = 'completed' AND orders.region = 'East'"

Case 2: Multi-table Wildcard Rules

# Permission rules (all tables' deleted field must be 0)
rules = ["*.*.deleted = 0"]

# Original SQL
sql = """
SELECT o.*, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'pending'
"""

# Apply permissions
filtered_sql = guard(sql=sql, sql_dialect="postgres", rules=rules)

# Result (both tables injected)
"""
SELECT o.*, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'pending' AND o.deleted = 0 AND c.deleted = 0
"""

Case 3: Complex Subqueries

# Permission rules
rules = ["orders.user_id = {{user_id}}"]

# Original SQL (contains subquery and CTE)
sql = """
WITH monthly_sales AS (
    SELECT DATE_TRUNC('month', order_date) AS month, SUM(amount) AS total
    FROM orders
    WHERE status = 'completed'
    GROUP BY month
)
SELECT * FROM monthly_sales
WHERE total > (
    SELECT AVG(total) FROM monthly_sales
)
"""

# Apply permissions
filtered_sql = guard(
    sql=sql,
    sql_dialect="postgres",
    rules=rules,
    variables={"user_id": "12345"}
)

# Result (CTE internals injected)
"""
WITH monthly_sales AS (
    SELECT DATE_TRUNC('month', order_date) AS month, SUM(amount) AS total
    FROM orders
    WHERE status = 'completed' AND orders.user_id = '12345'
    GROUP BY month
)
SELECT * FROM monthly_sales
WHERE total > (
    SELECT AVG(total) FROM monthly_sales
)
"""

8. Comparison with Other Solutions

Solution	Transparency	Completeness	Security	Performance	Maintenance Cost
AST Rewriting	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐⭐⭐
Application Layer Concatenation	⭐⭐	⭐⭐	⭐⭐	⭐⭐⭐⭐⭐	⭐⭐
Database Views	⭐⭐⭐	⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐
ORM Filters	⭐⭐⭐⭐	⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐
Row-Level Security	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐	⭐⭐⭐

Why AskTable chose AST rewriting:

•Completely transparent (AI doesn't need to perceive permissions)
•Supports all SQL syntax
•No SQL injection risk
•Cross-database compatibility
•Low maintenance cost (rules centrally managed)

9. Practical Experience and Pitfalls

9.1 Dialect Difference Handling

Problem: Different databases have different identifier quoting rules.

Solution:

•MySQL: Backticks `table`
•PostgreSQL: Double quotes "table"
•Oracle: Double quotes "TABLE" (uppercase)

# Decide whether to quote based on dialect
should_quote = self.dialect != "oracle"
column.set("table", exp.to_identifier(name=table_name, quoted=should_quote))

9.2 Alias Recognition

Problem: How to distinguish real aliases from table names?

-- Real alias
SELECT * FROM orders o

-- Table name (no alias)
SELECT * FROM orders

Solution:

has_real_alias = source.alias and source.alias != source.name

9.3 Condition Deduplication

Problem: Same table appears multiple times in SQL, will inject conditions repeatedly.

Solution:

•Deduplicate based on SQL string
•Use set to record already-injected conditions

9.4 Fallback Solution

Problem: What to do when AST rewriting fails?

Solution:

•Catch exception, log it
•Fallback to LLM rewriting (let AI understand permission rules and rewrite SQL)
•Ensure system availability

try:
    filtered_sql = guard(sql, dialect, rules, variables)
except Exception:
    # Fallback to LLM rewriting
    filtered_sql = await llm_rewrite_sql(sql, rules)

10. Summary and Outlook

AskTable's SQL Permission Guard implements:

✅ Transparency: AI and users don't need to perceive permission control ✅ Completeness: Support all SQL syntax (subqueries, CTE, UNION, JOIN) ✅ Security: No SQL injection risk, no permission omissions ✅ Flexibility: Support dynamic variables and wildcard rules ✅ High Performance: AST rewriting overhead < 10ms

Future Optimization Directions

•Column-level Permissions: Support field-level permission control (like masking, hiding)
•Permission Cache: Cache rule AST to improve performance
•Permission Audit: Record all permission injection operations for auditing
•Visual Debugging: Provide AST visualization tool for easier rule debugging

Open Source Plan

We plan to open source the SQL Permission Guard to help more teams build secure data analysis systems. Stay tuned!

Related Reading:

Technical Exchange:

•Website: asktable.com

cta.readyToSimplify

sidebar.noProgrammingNeeded
sidebar.startFreeTrial

cta.startFree cta.viewPricing

cta.noCreditCard

cta.quickStart

cta.dbSupport

sidebar.joinAskTableCommunity

SQL Permission Guard: Row-Level Rules via SQLGlot AST Rewrites

1. Why Do We Need SQL Permission Guard?

1.1 Pain Points of Traditional Solutions

1.2 What Do We Need?

2. Technology Selection: Why SQLGlot?

2.1 SQLGlot: Swiss Army Knife for SQL

2.2 Scope-based Permission Injection

3. Architecture Design: Clean Architecture

Core Components

4. Core Implementation: Deep Dive into Source Code

4.1 Permission Rule Definition

4.2 Rule Parsing: Jinja2 + Wildcards

4.3 Scope Recursive Processing: Inject Permission Conditions

4.4 Condition Injection: WHERE Clause Rewriting

4.5 Alias Handling: Correct Table Name Referencing

5. Complex Scenario Handling

5.1 Subqueries

5.2 CTE (Common Table Expression)

5.3 UNION

5.4 JOIN

6. Performance Optimization and Security Assurance

6.1 Performance Optimization

6.2 Security Assurance

7. Practical Cases

Case 1: Sales Region Permission

Case 2: Multi-table Wildcard Rules

Case 3: Complex Subqueries

8. Comparison with Other Solutions

9. Practical Experience and Pitfalls

9.1 Dialect Difference Handling

9.2 Alias Recognition

9.3 Condition Deduplication

9.4 Fallback Solution

10. Summary and Outlook

Future Optimization Directions

Open Source Plan

cta.readyToSimplify