API Reference
Core pipeline
weirding
weirding public API: the XML <-> JSON Schema IR <-> Pydantic triangle.
Exports compile, from_schema, define_model, parse, to_xml, to_schema, and dump_xml — the full edge set of the triangle (ADR-0012).
compile(xml)
Convert an XML schema document to a JSON Schema IR dict.
The JSON Schema dict is the core product of weirding. It can be consumed directly (Databricks StructType, jsonschema.validate, OpenAPI specs) or passed to from_schema() to build a typed DTO.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| xml | str \| bytes | XML schema document using the plain-attribute annotation convention or XSD. XSD requires the weirding[xsd] extra. | required |
Returns:
| Type | Description |
|---|---|
| JsonSchemaIR | JSON Schema-compatible dict (draft 2020-12 subset). Array fields include an x-weirding-item-tag extension key naming the child element tag. |
Raises:
| Type | Description |
|---|---|
| SchemaError | schema document is structurally invalid. |
| UnsupportedDialectError | dialect cannot be detected or is unsupported. |
| ParseError | xml is not well-formed. |
Source code in src/weirding/__init__.py
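To make the return shape concrete, the sketch below hand-writes a dict in the form described above and walks it with the standard library only. The IR literal, the Order / lines / line names, and the item_tags helper are illustrative assumptions; this is not output captured from compile().

```python
from typing import Any

JsonSchemaIR = dict[str, Any]  # same alias this reference documents

# Hand-written example of what an IR *might* look like (assumed shape).
EXAMPLE_IR: JsonSchemaIR = {
    "title": "Order",
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "lines": {
            "type": "array",
            "items": {"type": "string"},
            # Extension key naming the child element tag for each item.
            "x-weirding-item-tag": "line",
        },
    },
    "required": ["id"],
}


def item_tags(ir: JsonSchemaIR) -> dict[str, str]:
    """Map each top-level array property to its item tag (hypothetical helper)."""
    tags: dict[str, str] = {}
    for name, node in ir.get("properties", {}).items():
        if node.get("type") == "array" and "x-weirding-item-tag" in node:
            tags[name] = node["x-weirding-item-tag"]
    return tags
```

Because the IR is a plain dict, any consumer that understands JSON Schema can read it the same way; unknown x- keys are simply extra entries.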
define_model(xml, *, builder=None)
Compile XML then build a typed DTO, naming the type from the root element tag.
Equivalent to
schema = compile(xml)
return from_schema(schema, name=
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| xml | str \| bytes | XML schema document (plain-attribute convention or XSD). | required |
| builder | DTOBuilder \| None | DTOBuilder implementation. Defaults to PydanticBuilder(). | None |
Returns:
| Type | Description |
|---|---|
| type | A new type produced by the builder. |
Source code in src/weirding/__init__.py
from_schema(schema, *, name='Model', builder=None)
from_schema(schema: JsonSchemaIR, *, name: str = ...) -> type[BaseModel]
from_schema(schema: JsonSchemaIR, *, name: str = ..., builder: None) -> type[BaseModel]
from_schema(schema: JsonSchemaIR, *, name: str = ..., builder: DTOBuilder) -> type
Build a typed DTO class from a JSON Schema IR dict.
With the default builder (PydanticBuilder), returns a Pydantic v2 BaseModel subclass. Pass a custom DTOBuilder to produce TypedDicts, dataclasses, Spark StructType wrappers, or any other typed container.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| schema | JsonSchemaIR | JSON Schema dict as produced by compile() or any compatible source. | required |
| name | str | Class name for the generated type. Must be a valid Python identifier. | 'Model' |
| builder | DTOBuilder \| None | DTOBuilder implementation. Defaults to PydanticBuilder(). | None |
Returns:
| Type | Description |
|---|---|
| type | A new type produced by the builder. Each call with a distinct schema produces a distinct class; cache the result if calling in a hot path. |
Raises:
| Type | Description |
|---|---|
| SchemaError | schema cannot be converted to a valid typed class. |
Source code in src/weirding/__init__.py
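The caching advice above can be sketched as a small memo keyed on the schema's canonical JSON. Everything here is an assumption for illustration: cached_build and fake_from_schema are hypothetical names, and fake_from_schema only stands in for from_schema() so the example runs without weirding installed.

```python
import json
from typing import Any, Callable

_CACHE: dict[tuple[str, str], type] = {}


def cached_build(schema: dict[str, Any], name: str,
                 build: Callable[..., type]) -> type:
    """Return one class per (name, schema) pair, building it at most once."""
    # Canonical JSON makes the key independent of dict insertion order.
    key = (name, json.dumps(schema, sort_keys=True))
    if key not in _CACHE:
        _CACHE[key] = build(schema, name=name)
    return _CACHE[key]


def fake_from_schema(schema: dict[str, Any], *, name: str = "Model") -> type:
    """Stand-in for from_schema(): creates a fresh class on every call."""
    return type(name, (), {"schema": schema})
```

In real use, the third argument would be from_schema itself. Keying on sorted JSON means two dicts with the same content but different key order share one class.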
parse(xml, model)
Validate and bind LLM-produced XML against a compiled model.
Deserializes the XML element tree into a dict, then calls model.model_validate(dict) to produce a typed instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| xml | str \| bytes | XML data document (LLM output). Must be well-formed. | required |
| model | type[Validatable] | Any type satisfying the Validatable protocol; in practice, a type produced by define_model() or from_schema(). | required |
Returns:
| Type | Description |
|---|---|
| Any | A validated instance of model. |
Raises:
| Type | Description |
|---|---|
| ParseError | xml is malformed or fails model validation. |
Source code in src/weirding/__init__.py
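The two steps described above (element tree to dict, then model_validate) can be approximated with the standard library. This is a sketch under stated assumptions, not weirding's implementation: element_to_dict, parse_sketch, and the Point class are hypothetical, list detection here relies on repeated tags rather than on the schema, and ValueError stands in for ParseError.

```python
import xml.etree.ElementTree as ET
from typing import Any


def element_to_dict(elem: ET.Element) -> Any:
    """Leaf elements become their text; other elements become dicts."""
    children = list(elem)
    if not children:
        return elem.text or ""
    out: dict[str, Any] = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in out:
            # A repeated tag is collected into a list (simplification).
            if not isinstance(out[child.tag], list):
                out[child.tag] = [out[child.tag]]
            out[child.tag].append(value)
        else:
            out[child.tag] = value
    return out


class Point:
    """Hypothetical non-Pydantic type exposing model_validate()."""

    def __init__(self, x: int, y: int) -> None:
        self.x, self.y = x, y

    @classmethod
    def model_validate(cls, obj: Any) -> "Point":
        return cls(x=int(obj["x"]), y=int(obj["y"]))


def parse_sketch(xml: str, model: Any) -> Any:
    """Deserialize, then validate; one exception type covers both failures."""
    try:
        root = ET.fromstring(xml)
        return model.model_validate(element_to_dict(root))
    except (ET.ParseError, KeyError, ValueError, TypeError) as exc:
        raise ValueError(f"could not parse: {exc}") from exc
```

Folding both failure modes into one exception mirrors the documented contract, where malformed XML and failed validation both surface as ParseError.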
to_xml(instance)
Serialize a Pydantic model instance to an XML string.
Element tags map to field names, scalars become text content, objects become child elements, and lists are serialized as repeated child elements using the tag name from the original schema (x-weirding-item-tag).
Round-trip guarantee: parse(to_xml(x), type(x)) == x for any instance x produced by parse().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| instance | BaseModel | Any Pydantic BaseModel instance, including dynamically generated ones. | required |
Returns:
| Type | Description |
|---|---|
| str | UTF-8 XML string without an XML declaration. |
Source code in src/weirding/__init__.py
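The mapping rules above can be illustrated on a plain dict rather than a model instance. The sketch assumes that list items are wrapped inside an element named after the field, and that the singularization fallback simply drops a trailing "s" and otherwise uses "item"; both are guesses at the behaviour described, and dict_to_xml, _append, and fallback_item_tag are hypothetical names.

```python
import xml.etree.ElementTree as ET
from typing import Any, Optional


def fallback_item_tag(field: str) -> str:
    """Drop a trailing 's' (tags -> tag); otherwise fall back to 'item'."""
    return field[:-1] if field.endswith("s") and len(field) > 1 else "item"


def _append(parent: ET.Element, name: str, value: Any,
            item_tags: dict[str, str]) -> None:
    elem = ET.SubElement(parent, name)
    if isinstance(value, dict):
        # Objects become child elements, one per key.
        for key, child in value.items():
            _append(elem, key, child, item_tags)
    elif isinstance(value, list):
        # Lists become repeated children named by the item tag.
        tag = item_tags.get(name) or fallback_item_tag(name)
        for child in value:
            _append(elem, tag, child, item_tags)
    else:
        # Scalars become text content.
        elem.text = str(value)


def dict_to_xml(root_tag: str, data: dict[str, Any],
                item_tags: Optional[dict[str, str]] = None) -> str:
    root = ET.Element(root_tag)
    for key, value in data.items():
        _append(root, key, value, item_tags or {})
    # encoding="unicode" returns a str with no XML declaration.
    return ET.tostring(root, encoding="unicode")
```

Passing item_tags plays the role of the x-weirding-item-tag values carried by a schema; omitting it exercises the fallback.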
to_schema(model)
Derive a JSON Schema IR dict from a Pydantic v2 model class.
The reverse edge C → B — the inverse of from_schema() (ADR-0012). Rather than reimplementing type extraction, it normalizes model.model_json_schema() into weirding's IR, so it tracks Pydantic's own type → schema logic across releases.
Array properties that lack the x-weirding-item-tag extension key (hand-written models not built by weirding) receive a synthesized tag via the same singularization fallback used by to_xml() (tags → tag, else item), keyed on the property name. Models produced by from_schema() already carry the tag and are passed through unchanged. The required list is restored as [] on any object node where Pydantic omitted it, keeping the IR symmetric with compile().
$defs / $ref produced by Pydantic for nested models are left intact; resolving them is dump_xml()'s concern, not this one. Consequently to_schema(from_schema(ir)) equals ir only modulo $ref-inlining and the additive title / default keys Pydantic injects — exact dict equality is not guaranteed.
The model is never mutated; the function operates on a deep copy of Pydantic's schema output and returns a new dict.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | type[BaseModel] | A Pydantic v2 BaseModel subclass. | required |
Returns:
| Type | Description |
|---|---|
| JsonSchemaIR | A new JSON Schema IR dict. May contain $defs / $ref when the model has nested object fields. |
Raises:
| Type | Description |
|---|---|
| SchemaError | The model's schema contains prefixItems (a tuple field), which is unrepresentable in the IR (ADR-0004, MEMORY rule 11). |
Source code in src/weirding/__init__.py
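The normalization steps described above (deep copy, synthesized item tags, restored required lists, rejection of prefixItems) can be sketched over an arbitrary schema dict. This is an approximation under assumptions: normalize and _walk are hypothetical names, the singularization rule is the simple trailing-"s" guess, only arrays that are direct properties receive a tag, and ValueError stands in for SchemaError.

```python
import copy
from typing import Any


def normalize(schema: dict[str, Any]) -> dict[str, Any]:
    """Return a normalized deep copy; the input dict is left untouched."""
    out = copy.deepcopy(schema)
    _walk(out)
    return out


def _walk(node: Any, prop_name: str = "") -> None:
    if isinstance(node, list):
        for child in node:
            _walk(child)
        return
    if not isinstance(node, dict):
        return
    if "prefixItems" in node:
        raise ValueError("prefixItems (tuple field) is not representable")
    if node.get("type") == "object":
        # Restore the required list where it was omitted.
        node.setdefault("required", [])
    if node.get("type") == "array" and prop_name:
        singular = prop_name[:-1] if prop_name.endswith("s") and len(prop_name) > 1 else "item"
        # Existing tags are passed through unchanged.
        node.setdefault("x-weirding-item-tag", singular)
    for key, child in list(node.items()):
        if key in ("properties", "$defs") and isinstance(child, dict):
            for name, sub in child.items():
                _walk(sub, name if key == "properties" else "")
        else:
            _walk(child)
```

Note that $defs and $ref entries are walked but never resolved, matching the division of labour described above.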
dump_xml(ir)
Serialize a JSON Schema IR dict to a canonical XML schema document.
The reverse edge B → A — the inverse of compile() (ADR-0012). It emits a canonical ADR-0001 plain-attribute annotation XML schema document from the public JsonSchemaIR dict.
Distinct from to_xml(): dump_xml(ir) serializes a JSON Schema IR into an XML schema document — the authoring format, the inverse of compile(). to_xml(instance) serializes a model instance into XML data — the inverse of parse(). One produces a schema you could re-author; the other produces a data payload.
The composition C → A (Pydantic model → XML schema document) is exactly dump_xml(to_schema(model)); no third function exists for it (ADR-0012).
Any local $ref / $defs are inlined first; the root element name is taken from ir['title'] (falling back to "Model"). The two nullable input shapes anyOf:[T, null] and type:[T, "null"] both map to nullable="true". format and additionalProperties are dropped (no annotation-convention equivalent); const is rejected.
The function is pure: it deep-copies its input, never mutates it, and performs no I/O, logging, or network access.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ir | JsonSchemaIR | A JSON Schema IR dict, as produced by compile() or any compatible source. Must be acyclic and free of non-null unions. | required |
Returns:
| Type | Description |
|---|---|
| str | A pretty-printed XML schema document using the ADR-0001 plain-attribute annotation convention. |
Raises:
| Type | Description |
|---|---|
| SchemaError | The IR contains a construct with no annotation-convention representation: a non-null anyOf / oneOf / allOf, a const, or a cyclic / non-local / unresolvable $ref. The message names the offending construct. |
Source code in src/weirding/__init__.py
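The handling of the two nullable shapes and the rejected constructs can be sketched as a single classification step. This is one possible reading, not the library's code: split_nullable is a hypothetical helper, ValueError stands in for SchemaError, and the sketch conservatively rejects every oneOf / allOf.

```python
from typing import Any


def split_nullable(node: dict[str, Any]) -> tuple[dict[str, Any], bool]:
    """Return (non-null schema, nullable flag) for the accepted shapes."""
    if "const" in node:
        raise ValueError("const has no annotation-convention representation")
    for key in ("oneOf", "allOf"):
        if key in node:
            raise ValueError(f"{key} has no annotation-convention representation")
    if "anyOf" in node:
        branches = node["anyOf"]
        non_null = [b for b in branches if b.get("type") != "null"]
        has_null = len(non_null) < len(branches)
        # Only anyOf:[T, null] is accepted; real unions are rejected.
        if len(non_null) != 1 or not has_null:
            raise ValueError("non-null anyOf has no annotation-convention representation")
        return non_null[0], True
    kind = node.get("type")
    if isinstance(kind, list):
        non_null_types = [t for t in kind if t != "null"]
        # Only type:[T, "null"] is accepted.
        if len(non_null_types) != 1 or "null" not in kind:
            raise ValueError("non-null type union has no annotation-convention representation")
        return {**node, "type": non_null_types[0]}, True
    return node, False
```

A serializer built on this would emit nullable="true" whenever the flag is set, so both input shapes converge on the same XML attribute.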
Prompt utilities
weirding.prompt
Prompt engineering utilities for LLM structured-output retry workflows.
RetryContext
Stateful context for an LLM structured-output retry loop.
Tracks attempt count and accumulated errors so that format_error() can produce increasingly specific retry prompts without the caller managing state.
Source code in src/weirding/prompt.py
attempt
property
Return the current attempt number (0-based before first error).
exceeded
property
Return True if the maximum number of retry attempts has been reached.
__init__(model, max_attempts=3)
Initialize with the given model and an optional max-attempts limit.
Source code in src/weirding/prompt.py
record_error(error)
Record a validation error and increment the attempt counter.
Source code in src/weirding/prompt.py
retry_message()
Return the formatted error message from the last recorded error.
Source code in src/weirding/prompt.py
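To show how these members fit together in a retry loop, the sketch below defines a local stand-in with the same member names. Its internals are assumptions (the real class formats errors through format_error(), replaced here by a plain string), and run_with_retries, ask, and parse are hypothetical names.

```python
from typing import Any, Callable, Optional


class RetryContextSketch:
    """Stand-in with the documented members; internals are assumptions."""

    def __init__(self, model: Any, max_attempts: int = 3) -> None:
        self.model = model
        self.max_attempts = max_attempts
        self._errors: list[Exception] = []

    @property
    def attempt(self) -> int:
        return len(self._errors)  # 0 before the first recorded error

    @property
    def exceeded(self) -> bool:
        return self.attempt >= self.max_attempts

    def record_error(self, error: Exception) -> None:
        self._errors.append(error)

    def retry_message(self) -> str:
        # Placeholder formatting for the last recorded error.
        return f"Attempt {self.attempt} failed: {self._errors[-1]}"


def run_with_retries(ask: Callable[[Optional[str]], str],
                     parse: Callable[[str], Any],
                     ctx: RetryContextSketch) -> Any:
    """Call ask() until parse() succeeds or the attempt budget is spent."""
    hint: Optional[str] = None
    while not ctx.exceeded:
        try:
            return parse(ask(hint))
        except ValueError as exc:
            ctx.record_error(exc)
            hint = ctx.retry_message()  # fed back into the next prompt
    raise RuntimeError(f"gave up after {ctx.attempt} attempts")
```

The caller owns only the loop; attempt counting and error accumulation stay inside the context object, which is the division of responsibility described above.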
to_template(model)
Generate an XML prompt template from a compiled Pydantic model.
Produces an XML document showing the expected element structure and
field types, suitable for inclusion in an LLM system prompt. Scalar
fields are rendered as
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | type[BaseModel] | A type[BaseModel] produced by define_model() or from_schema(). | required |
Returns:
| Type | Description |
|---|---|
| str | XML string showing the expected output format. |
Source code in src/weirding/prompt.py
format_error(error, *, model)
Format a validation error into a natural-language retry instruction.
Converts pydantic.ValidationError (or weirding.ParseError wrapping one) into a human-readable description of what the LLM got wrong, suitable for appending to a retry prompt.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| error | Exception | The exception raised by parse() on the failed attempt. | required |
| model | type[BaseModel] | The model that was being validated against. Used to provide field-level context in the error message. | required |
Returns:
| Type | Description |
|---|---|
| str | Plain text description of the validation failures. |
Source code in src/weirding/prompt.py
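The kind of conversion described above can be sketched over a list of error dicts carrying loc and msg keys, which is the shape pydantic's ValidationError.errors() returns. The describe_errors helper and the wording of its output are illustrative assumptions; the actual text produced by format_error() is not documented here.

```python
from typing import Any, Sequence


def describe_errors(errors: Sequence[dict[str, Any]]) -> str:
    """Turn error dicts (loc / msg keys) into a retry instruction."""
    lines = ["Your previous XML output was invalid. Please fix:"]
    for err in errors:
        # loc is a tuple of field names and list indices.
        path = ".".join(str(part) for part in err.get("loc", ())) or "(root)"
        lines.append(f"- {path}: {err.get('msg', 'invalid value')}")
    return "\n".join(lines)
```

With pydantic installed, the input would typically come from exc.errors() on a caught ValidationError.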
Protocols and types
JsonSchemaIR = dict[str, Any]
module-attribute
DTOBuilder
Bases: Protocol
Any type that can build a typed DTO class from a JSON Schema IR dict.
The default implementation is PydanticBuilder, which produces Pydantic v2 BaseModel subclasses. Custom implementations can produce TypedDicts, dataclasses, Spark StructType wrappers, or any other typed container.
Symmetric with Validatable: both ends of the pipeline are backend-neutral. Pydantic is the default, not the requirement.
Source code in src/weirding/__init__.py
build(schema, *, name)
Build and return a DTO class from the given JSON Schema IR.
Source code in src/weirding/__init__.py
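A custom builder only needs a build(schema, *, name) method. The sketch below targets stdlib dataclasses and handles flat object schemas with scalar types only; DTOBuilderSketch is a local mirror of the documented signature, and DataclassBuilder and _PY_TYPES are hypothetical names, not part of weirding.

```python
import dataclasses
from typing import Any, Protocol


class DTOBuilderSketch(Protocol):
    """Local mirror of the documented build(schema, *, name) signature."""

    def build(self, schema: dict[str, Any], *, name: str) -> type: ...


_PY_TYPES = {"string": str, "integer": int, "number": float, "boolean": bool}


class DataclassBuilder:
    """Hypothetical builder: flat object schemas -> stdlib dataclasses."""

    def build(self, schema: dict[str, Any], *, name: str) -> type:
        required = set(schema.get("required", []))
        fields: list[tuple] = []
        for prop, node in schema.get("properties", {}).items():
            py_type = _PY_TYPES.get(node.get("type"), Any)
            if prop in required:
                fields.append((prop, py_type))
            else:
                # Optional fields default to None.
                fields.append((prop, py_type, dataclasses.field(default=None)))
        # Fields without defaults must come first in a dataclass.
        fields.sort(key=len)
        return dataclasses.make_dataclass(name, fields)
```

An instance of such a class would be passed as builder= to from_schema() or define_model(); because the protocol is structural, no inheritance from a weirding base class is needed.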
PydanticBuilder
Default DTOBuilder — produces Pydantic v2 BaseModel subclasses.
Built on json-schema-to-pydantic. Patches applied:
- schema["additionalProperties"] == False → model_config extra="forbid"
- prefixItems must never appear in schema (enforced by _schema.py)
Source code in src/weirding/__init__.py
build(schema, *, name='Model')
Build a Pydantic BaseModel subclass from the given JSON Schema IR.
Source code in src/weirding/__init__.py
Validatable
Bases: Protocol
Any type that can validate a dict into a typed instance.
Satisfied by every Pydantic BaseModel subclass. Accepting a Protocol instead of BaseModel directly keeps parse() open to non-Pydantic backends (TypeAdapter wrappers, Spark schema validators, etc.) without an API change.
Source code in src/weirding/__init__.py
model_validate(obj)
classmethod
Validate data dict and return a model instance.
Source code in src/weirding/__init__.py
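Structural typing means a class qualifies by shape alone. The sketch below mirrors the protocol locally and checks a plain class against it at runtime; ValidatableSketch and Celsius are hypothetical names, and the @runtime_checkable decorator is an addition for demonstration that the real protocol may or may not carry.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ValidatableSketch(Protocol):
    """Local mirror: anything with a model_validate classmethod."""

    @classmethod
    def model_validate(cls, obj: Any) -> Any: ...


class Celsius:
    """Plain class with no Pydantic dependency."""

    def __init__(self, degrees: float) -> None:
        self.degrees = degrees

    @classmethod
    def model_validate(cls, obj: Any) -> "Celsius":
        # Validation here is just a type coercion.
        return cls(float(obj["degrees"]))
```

A runtime isinstance check only confirms that the method exists, not that its signature matches; static type checkers perform the stricter comparison.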
Exceptions
WeirdingError
Bases: Exception
Source code in src/weirding/_exceptions.py
SchemaError
Bases: WeirdingError
Source code in src/weirding/_exceptions.py
ParseError
Bases: WeirdingError
Source code in src/weirding/_exceptions.py
UnsupportedDialectError
Bases: WeirdingError
Source code in src/weirding/_exceptions.py
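Because all three specific errors derive from WeirdingError, a caller can handle one case specially and catch the rest through the base class. The classes below are local stand-ins reproducing the documented hierarchy so the example runs without weirding installed; the classify helper and its retry policy are illustrative assumptions, not library behaviour.

```python
class WeirdingError(Exception):
    """Local stand-in for the documented base class."""


class SchemaError(WeirdingError):
    pass


class ParseError(WeirdingError):
    pass


class UnsupportedDialectError(WeirdingError):
    pass


def classify(exc: Exception) -> str:
    """Decide how a caller might react to a failure (hypothetical policy)."""
    if isinstance(exc, ParseError):
        return "retry"  # bad LLM output: asking again may help
    if isinstance(exc, WeirdingError):
        return "fix-schema"  # schema or dialect problems persist across retries
    return "unexpected"
```

Checking the most specific class first matters, since a ParseError is also a WeirdingError.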