AST Method Matrix
This page documents how the typedframes static checker handles each DataFrame operation.
Operations fall into three categories: schema-modifying (the checker updates its
internal column model), row-passthrough (the checker assumes the schema is unchanged),
and untracked (the variable is dropped from tracking to avoid false positives).
Schema-Modifying Operations
The checker updates its column model when it sees these operations, so subsequent accesses are validated against the new schema.
| Operation | Effect on schema | Example |
|---|---|---|
| `df["col"] = val` | Adds `"col"` to the schema | `df["score"] = df["value"] * 2` |
| `del df["col"]` | Removes `"col"` from the schema | `del df["temp"]` |
| `df.drop(columns=[…])` | Removes listed columns | `df.drop(columns=["a", "b"])` |
| `df.drop([…])` | Removes listed columns (positional) | `df.drop(["a", "b"])` |
| `df.assign(col=…)` | Adds new column(s) to the schema | `df.assign(full_name=…)` |
| `df.rename(columns={…})` | Renames columns in the schema | `df.rename(columns={"a": "b"})` |
| `df.select([…])` | Narrows schema to selected columns | `df.select(["id", "name"])` |
| `df.select(pl.col("…"))` | Narrows schema to the named column | `df.select(pl.col("id"))` |
| `df.pop("col")` | Removes `"col"` from the schema | `df.pop("score")` |
| `df.insert(pos, "col", val)` | Adds `"col"` to the schema | `df.insert(0, "rank", …)` |
| `df[["c1", "c2"]]` | Narrows schema to selected columns | `subset = df[["id", "name"]]` |
| `pd.merge(left, right, …)` | Merges both schemas | `merged = pd.merge(a, b, on="id")` |
| `pd.concat([df1, df2], …)` | Unions both schemas | `combined = pd.concat([a, b])` |
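
To make the idea of a column model concrete, here is a minimal sketch of how a static pass could update a set of column names for the first two rows of the table (`df["col"] = val` and `del df["col"]`). It is an illustration only, not the typedframes implementation: `update_schema` is a hypothetical helper, and the code assumes Python 3.9+ (where a subscript's `slice` is the expression node itself).

```python
import ast


def update_schema(columns: set[str], source: str, var: str = "df") -> set[str]:
    """Return the column set after applying one statement to `var`.

    Hypothetical helper for illustration; not part of typedframes.
    """
    stmt = ast.parse(source).body[0]
    cols = set(columns)  # work on a copy so the caller's set is untouched

    def literal_key(target: ast.expr):
        # Match `<var>["literal"]` and return the literal, else None.
        if (
            isinstance(target, ast.Subscript)
            and isinstance(target.value, ast.Name)
            and target.value.id == var
            and isinstance(target.slice, ast.Constant)
            and isinstance(target.slice.value, str)
        ):
            return target.slice.value
        return None

    if isinstance(stmt, ast.Assign):
        # df["col"] = val  adds "col" to the model
        for target in stmt.targets:
            name = literal_key(target)
            if name is not None:
                cols.add(name)
    elif isinstance(stmt, ast.Delete):
        # del df["col"]  removes "col" from the model
        for target in stmt.targets:
            name = literal_key(target)
            if name is not None:
                cols.discard(name)
    return cols


schema = {"id", "value"}
schema = update_schema(schema, 'df["score"] = df["value"] * 2')
schema = update_schema(schema, 'del df["value"]')
print(sorted(schema))
```

Statements the sketch does not recognize leave the set unchanged; a real checker would need a rule for each row of the table.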
Row-Passthrough Operations
The checker leaves the schema unchanged for these operations — the output variable inherits the same column model as the input.
| Operation | Notes |
|---|---|
| `df.filter(…)` | Row filter; columns unchanged |
| `df.query(…)` | pandas query string; columns unchanged |
| `df.head(n)` | First n rows; columns unchanged |
| `df.tail(n)` | Last n rows; columns unchanged |
| `df.sample(…)` | Random sample; columns unchanged |
| `df.sort_values(…)` | Row sort; columns unchanged |
| `df.sort(…)` | polars row sort; columns unchanged |
| `df.reset_index(…)` | Index reset; columns unchanged |
| `df.nlargest(n, col)` | Top n rows; columns unchanged |
| `df.nsmallest(n, col)` | Bottom n rows; columns unchanged |
| `df.fillna(…)` | Fill NaN values; columns unchanged |
| `df.dropna(…)` | Drop NaN rows; columns unchanged |
| `df.ffill()` / `df.bfill()` | Forward/back fill; columns unchanged |
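
As a rough sketch of what "inherits the same column model" can mean in practice, the snippet below copies a tracked variable's schema to the assignment target when every method in the call chain is one of the row-passthrough methods listed above. The `propagate` function and the `schemas` dictionary are hypothetical names used for illustration, not typedframes internals.

```python
import ast

# Method names taken from the row-passthrough table.
ROW_PASSTHROUGH = frozenset({
    "filter", "query", "head", "tail", "sample", "sort_values", "sort",
    "reset_index", "nlargest", "nsmallest", "fillna", "dropna",
    "ffill", "bfill",
})


def propagate(schemas: dict, source: str) -> None:
    """Record a schema for `target = base.m1(...).m2(...)`.

    The target inherits base's schema only if every method in the chain
    is row-passthrough and `base` is already tracked. Hypothetical helper.
    """
    stmt = ast.parse(source).body[0]
    if not (
        isinstance(stmt, ast.Assign)
        and len(stmt.targets) == 1
        and isinstance(stmt.targets[0], ast.Name)
    ):
        return
    node = stmt.value
    # Peel method calls from the outside in while they are passthrough.
    while (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr in ROW_PASSTHROUGH
    ):
        node = node.func.value
    # The chain must bottom out at a tracked variable name.
    if isinstance(node, ast.Name) and node.id in schemas:
        schemas[stmt.targets[0].id] = schemas[node.id]


schemas = {"df": frozenset({"id", "score"})}
propagate(schemas, 'top = df.sort_values("score").head(5)')
propagate(schemas, 'wide = df.pivot(index="id", columns="score").head(5)')
print(sorted(schemas))
```

In this sketch `top` ends up tracked with the same columns as `df`, while `wide` is left out because `pivot` is not a passthrough method.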
Untracked Operations
For these operations the result variable is not tracked: the checker won't report false positives on it, but it also won't validate column references against it.
These operations require runtime information (join keys, pivot categories, melt id-vars, explosion depth, etc.) that is not available to a static AST pass. Tracking them correctly would require evaluating expressions at lint time, which is out of scope for a static checker.
| Operation | Why untracked |
|---|---|
| `df.join(other, …)` | Output schema depends on join keys and `how=` parameter |
| `df.merge(other, …)` | (pandas instance method) Same as join |
| `df.pivot(…)` | Output columns are derived from cell values at runtime |
| `df.pivot_table(…)` | Same as pivot |
| `df.melt(…)` | Converts columns to rows; output schema varies by `id_vars` |
| `df.explode(col)` | Schema depends on list column depth |
| `pd.get_dummies(df, …)` | Columns come from categorical values, unknown at lint time |
| `df.stack(…)` | Pivots column level to row index |
| `df.unstack(…)` | Pivots row index to column level |
| `df.apply(fn, …)` | Output depends on the return type of `fn` |
| `df.map(fn, …)` | Output depends on `fn` |
| `df.transform(fn, …)` | Output depends on `fn` |
| `df.groupby(…).agg(…)` | Output columns are determined by aggregation spec |
| `df.with_columns(…)` | polars column addition/mutation; schema not narrowed statically |
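
The pivot case is a good illustration of why source text alone is not enough. The sketch below uses plain Python rather than pandas, and `pivot_columns` is a hypothetical stand-in that only computes the column labels a pivot would produce (one per distinct value in the pivoted column). The same call yields different columns for different data, so nothing in the AST can predict them.

```python
def pivot_columns(rows: list, columns_key: str) -> list:
    """Column labels a pivot on `columns_key` would produce.

    Plain-Python stand-in for illustration; one label per distinct value.
    """
    return sorted({row[columns_key] for row in rows})


# Two inputs with an identical input schema: day, metric, value.
monday = [
    {"day": "mon", "metric": "cpu", "value": 0.4},
    {"day": "mon", "metric": "mem", "value": 0.7},
]
tuesday = [
    {"day": "tue", "metric": "cpu", "value": 0.5},
    {"day": "tue", "metric": "disk", "value": 0.2},
]

# Same call, different output columns, decided entirely by cell values.
print(pivot_columns(monday, "metric"))   # ['cpu', 'mem']
print(pivot_columns(tuesday, "metric"))  # ['cpu', 'disk']
```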
Error Code Reference
| Code | Severity | Message | Default |
|---|---|---|---|
| `unknown-column` | Error | `Column '<name>' not found in <Schema>. Did you mean '<suggestion>'?` | Always reported |
| `reserved-name` | Error | `Renamed-from column '<name>' not found in <Schema>` | Always reported |
| `untracked-dataframe` | Warning | `Columns unknown at lint time — annotate with a schema to enable column checking` | Off by default |
| `dropped-unknown-column` | Warning | `Dropped column '<name>' does not exist in <Schema>` | Off by default |
untracked-dataframe is suppressed unless --strict-ingest is passed to the CLI. This keeps the
checker quiet on exploratory scripts that load data without a schema annotation.
`unknown-column` reports the closest column name as a typo suggestion when the edit distance is small (≤ 2 single-character edits), which helps catch common capitalization and spelling mistakes.
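
A minimal sketch of such a suggestion step is shown below. It assumes plain Levenshtein distance (insertions, deletions, substitutions) and alphabetical tie-breaking; the document does not say which edit-distance variant or tie-break rule the checker uses, and `levenshtein` and `suggest` are hypothetical names.

```python
from typing import Iterable, Optional


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free if equal
            ))
        prev = curr
    return prev[-1]


def suggest(name: str, columns: Iterable[str], max_distance: int = 2) -> Optional[str]:
    """Return the closest column within `max_distance` edits, else None.

    Ties are broken alphabetically (an assumption made for this sketch).
    """
    best = min(columns, key=lambda c: (levenshtein(name, c), c), default=None)
    if best is None or levenshtein(name, best) > max_distance:
        return None
    return best


columns = ["id", "name", "score"]
print(suggest("Score", columns))    # score  (1 edit: capitalization)
print(suggest("scroe", columns))    # score  (2 edits under plain Levenshtein)
print(suggest("revenue", columns))  # None   (nothing within 2 edits)
```

Under plain Levenshtein a swapped pair of adjacent letters costs two edits, so a transposition combined with a case change already exceeds the threshold; a Damerau-style distance would count the swap as a single edit.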