
sidebar.wechat

sidebar.feishu
sidebar.chooseYourWayToJoin

sidebar.scanToAddConsultant
In a multi-tenant data analysis system, SQL permission control is the core mechanism ensuring data security. Traditional permission control often relies on database-level views or row-level security policies, but this approach has problems like insufficient flexibility and difficulty in dynamic adjustment. AskTable adopts an innovative approach: AST-based SQL permission control system (SQL Guard), which dynamically injects permission conditions before SQL execution through SQLGlot library's Scope analysis capability.
SQL Guard's design follows the "Clean Architecture" principle, completely separating permission control logic from business logic. Its core idea is:
* wildcard matching for arbitrary schema or tablePermission rules use a SQL-like expression syntax, supporting Jinja2 variable templates:
# Example rules
rules = [
"users.orders.user_id = {{ current_user_id }}", # Users can only view their own orders
"*.*.region IN ('CN', 'US')", # Restrict region scope
"sales.*.created_at >= '2024-01-01'", # Time range restriction
]
The rule parser first converts rule strings to AST and identifies schema, table, and field information:
def parse_rule(rule: str, variables: dict, sql: str, dialect: str) -> list[Rule]:
# 1. Render Jinja2 variables
template = Template(rule)
rule_str = template.render(**variables)
# 2. Replace wildcard * with ALL
rule_str = rule_str.replace("*", "ALL")
# 3. Parse rule and SQL to AST
rule_ast = sqlglot.parse_one(rule_str, read=dialect)
sql_ast = sqlglot.parse_one(sql, read=dialect)
# 4. Extract column information from rule
rule_column = extract_column_from_rule(rule_ast)
# 5. Traverse all tables in SQL, match rules
rules = []
for table in sql_ast.find_all(exp.Table):
if matches(rule_column, table):
qualified_ast = qualify_rule_ast(rule_ast, table)
rules.append(Rule(
schema=table.db,
table=table.name,
field=rule_column.name,
ast=qualified_ast
))
return rules
SQLGlot's Scope provides powerful scope analysis capability, accurately identifying table aliases, subqueries, CTEs, and other complex structures:
def _process_scope(self, scope: Scope, rules: list[Rule]):
# 1. Build permission conditions for current scope
conditions = self._build_conditions_for_scope(scope, rules)
if conditions:
self._inject_conditions(scope, conditions)
# 2. Recursively process all sub-scopes
for subscope in self._get_all_subscopes(scope):
self._process_scope(subscope, rules)
def _get_all_subscopes(self, scope: Scope) -> list[Scope]:
return (
scope.derived_table_scopes + # FROM subqueries
scope.cte_scopes + # WITH CTE
scope.subquery_scopes + # WHERE/SELECT subqueries
scope.union_scopes + # UNION
scope.udtf_scopes # Table functions
)
Condition injection needs to handle table aliases, quotes, and merging with existing WHERE clauses:
def _inject_conditions(self, scope: Scope, conditions: list[exp.Expression]):
# 1. Find SELECT statement
select = scope.expression if isinstance(scope.expression, exp.Select) \
else scope.expression.find(exp.Select)
# 2. Merge all conditions (deduplicate)
combined = self._combine_conditions(conditions)
# 3. Inject into WHERE clause
where = select.find(exp.Where)
if where:
# Connect with existing conditions using AND
where.set("this", exp.And(this=where.this, expression=combined))
else:
# Create new WHERE clause
select.set("where", exp.Where(this=combined))
# Original SQL
sql = """
SELECT o.id, o.amount, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'completed'
"""
# Permission rules
rules = ["*.orders.tenant_id = '{{ tenant_id }}'"]
variables = {"tenant_id": "tenant_123"}
# SQL after applying permissions
protected_sql = guard(sql, "postgres", rules, variables)
Generated SQL:
SELECT o.id, o.amount, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'completed'
AND o.tenant_id = 'tenant_123' -- Auto injected
sql = """
SELECT *
FROM (
SELECT user_id, SUM(amount) as total
FROM orders
GROUP BY user_id
) subq
WHERE total > 1000
"""
rules = ["*.orders.region = 'CN'"]
protected_sql = guard(sql, "mysql", rules)
Generated SQL:
SELECT *
FROM (
SELECT user_id, SUM(amount) as total
FROM orders
WHERE region = 'CN' -- Injected inside subquery
GROUP BY user_id
) subq
WHERE total > 1000
sql = """
WITH recent_orders AS (
SELECT * FROM orders WHERE created_at > '2024-01-01'
)
SELECT * FROM recent_orders
UNION ALL
SELECT * FROM orders WHERE status = 'pending'
"""
rules = ["*.orders.user_id = {{ user_id }}"]
variables = {"user_id": "123"}
protected_sql = guard(sql, "postgres", rules, variables)
Generated SQL:
WITH recent_orders AS (
SELECT * FROM orders
WHERE created_at > '2024-01-01'
AND user_id = '123' -- Injected inside CTE
)
SELECT * FROM recent_orders
UNION ALL
SELECT * FROM orders
WHERE status = 'pending'
AND user_id = '123' -- Injected into UNION branch
* can match any schema or tableSQL Guard's performance overhead mainly comes from AST parsing and Scope building:
The AST-based SQL permission control system is a core component of AskTable's security architecture. Through SQLGlot's powerful capabilities, we achieve:
This design not only ensures data security but also provides elegant solutions for complex scenarios like multi-tenancy and row-level security.
sidebar.noProgrammingNeeded
sidebar.startFreeTrial