Dynamic Enums
My go-to choice for validating JSON files is pydantic.
It’s fast, reliable, and relatively flexible.
A feature I really appreciate is the ability to validate a field value against an Enum.
Here’s the example from the pydantic v2.10 Enum docs:
from enum import Enum, IntEnum

from pydantic import BaseModel, ValidationError


class FruitEnum(str, Enum):
    pear = "pear"
    banana = "banana"


class ToolEnum(IntEnum):
    spanner = 1
    wrench = 2


class CookingModel(BaseModel):
    fruit: FruitEnum = FruitEnum.pear
    tool: ToolEnum = ToolEnum.spanner


print(CookingModel())
#> fruit=<FruitEnum.pear: 'pear'> tool=<ToolEnum.spanner: 1>
print(CookingModel(tool=2, fruit="banana"))
#> fruit=<FruitEnum.banana: 'banana'> tool=<ToolEnum.wrench: 2>
try:
    CookingModel(fruit="other")
except ValidationError as e:
    print(e)
    """
    1 validation error for CookingModel
    fruit
      Input should be 'pear' or 'banana' [type=enum, input_value='other', input_type=str]
    """
[!NOTE]
mypy, another tool I’ve grown to love, would raise [arg-type] errors on these lines:
...
print(CookingModel(tool=2, fruit="banana"))  # [arg-type] error
...
CookingModel(fruit="other")  # [arg-type] error
The Problem
Defining an enum is great when you only have a handful of valid values, but what if you have more?
And I don’t mean like 10 or 20; I mean something closer to 50.
Typing out all the values into the enum class is a valid approach, but it invites mistakes (and I’m lazy).
A much better approach would be to have someone else type out all the values. 😂
No, really. I’m serious. My job isn’t to come up with the values in the JSON files, but to validate them. Which means someone else has already come up with a list of 50 valid values. And lucky for me, I have that list in a YAML file.
The Solution
Before I can construct an enum out of the values in the YAML file, I’ll need to extract them.
I can do this with the PyYAML package.
# The yaml file contents.
us_states: [AL, AK, AZ, AR, CA, CO, CT, DE, FL, GA, HI, ID, IL, IN, IA, KS, KY, LA, ME, MD, MA, MI, MN, MS, MO, MT, NE, NV, NH, NJ, NM, NY, NC, ND, OH, OK, OR, PA, RI, SC, SD, TN, TX, UT, VT, VA, WA, WV, WI, WY]
"""Contents of main.py"""
from yaml import load, Loader


def _get_valid_values() -> list[str]:
    """Read the yaml file and return a list of valid values."""
    with open("valid_values.yaml", mode="rb") as yml_file:
        data = load(stream=yml_file, Loader=Loader)
    valid_values: list[str] = data["us_states"]
    return valid_values


if __name__ == "__main__":
    valid_values = _get_valid_values()
    print(len(valid_values))  # 50
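One thing worth guarding against before turning a list like this into an enum: PyYAML follows YAML 1.1, so unquoted tokens such as NO or ON load as booleans rather than strings, and a duplicated entry would later clash as a member name. Here is a minimal sketch of a sanity check. The helper check_values is hypothetical (its name and error messages are assumptions, not part of PyYAML or the script above), and it uses only the standard library.

```python
from collections.abc import Sequence


def check_values(values: Sequence[object]) -> list[str]:
    """Return the values as a list of strings, or raise if something looks off.

    Hypothetical helper: the name and the error messages are assumptions.
    """
    strings = [v for v in values if isinstance(v, str)]
    if len(strings) != len(values):
        # e.g. an unquoted NO in the YAML file would arrive here as False
        bad = [v for v in values if not isinstance(v, str)]
        raise TypeError(f"non-string values in list: {bad!r}")
    duplicates = sorted({v for v in strings if strings.count(v) > 1})
    if duplicates:
        raise ValueError(f"duplicate values in list: {duplicates!r}")
    return strings


if __name__ == "__main__":
    print(check_values(["AL", "AK", "AZ"]))  # ['AL', 'AK', 'AZ']
```

In the script above, this could wrap the list read from the file before it is returned, so that a bad entry fails early with a readable message.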
Once we have the values as a Python object, we need to subclass enum.Enum and assign them.
But how?
On first attempt, maybe we call the _get_valid_values function within our new enum class:
"""Contents of main.py"""
from enum import Enum

from yaml import load, Loader


def _get_valid_values() -> list[str]: ...


class State(str, Enum):
    """Enumerations of valid values."""
    _get_valid_values()


if __name__ == "__main__":
    try:
        print(repr(State(value="MO")))
    except Exception as exc:
        print(repr(exc))
But this raises the following TypeError:
TypeError("<enum 'State'> has no members; specify `names=()` if you meant to create a new, empty, enum")
No, we can’t just call _get_valid_values and expect the values to be assigned in the State enum: the call’s return value is simply discarded, so the class ends up with no members.
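To see why, here is a small standard-library sketch. It assumes a stand-in function (_stand_in_values, a hypothetical name) in place of the YAML reader, and compares a class body that only calls the function with one that assigns a name.

```python
from enum import Enum


def _stand_in_values() -> list[str]:
    """Stand-in for the YAML reader (an assumption for this sketch)."""
    return ["MO", "KS"]


class Empty(str, Enum):
    # A bare call: the returned list is thrown away, nothing is assigned.
    _stand_in_values()


class Assigned(str, Enum):
    # Only names assigned in the class body become members.
    MO = "MO"


print(list(Empty))     # []
print(list(Assigned))  # [<Assigned.MO: 'MO'>]
```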
Or can we?
"""Contents of main.py"""
from enum import Enum

from yaml import load, Loader


def _get_valid_values() -> list[str]: ...


class State(str, Enum):
    """Enumerations of valid values."""
    # https://docs.python.org/3/library/enum.html#enum.Enum._ignore_
    _ignore_ = ["State", "value"]
    State = vars()
    for value in _get_valid_values():
        State[value] = value


if __name__ == "__main__":
    try:
        print(repr(State(value="MO")))  # <State.MO: 'MO'>
    except Exception as exc:
        print(repr(exc))
It turns out that you can assign the values within the enum by using the special _ignore_ attribute.
From the docs:
_ignore_ is a list of names that will not become members, and whose names will also be removed from the completed enumeration. See TimePeriod for an example.
To be more specific, setting _ignore_ to ["State", "value"] means we can use State and value as variables within the class’s logic.
Not ignoring "value" would result in a TypeError saying we’ve already assigned value, and not ignoring "State" would assign State as a valid value (which we don’t want).
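Both halves of that claim can be checked without the YAML file. The sketch below assumes a short hard-coded list (VALUES) as a stand-in for the loaded values, and build_without_ignoring_value is a hypothetical helper that repeats the loop with "value" left out of _ignore_.

```python
from enum import Enum

# Stand-in for the values loaded from the YAML file (an assumption for this sketch).
VALUES = ["MO", "KS", "IL"]


class State(str, Enum):
    _ignore_ = ["State", "value"]
    State = vars()  # the namespace of the class body
    for value in VALUES:
        State[value] = value


def build_without_ignoring_value() -> None:
    """Same loop, but 'value' is not listed in _ignore_."""

    class Broken(str, Enum):
        _ignore_ = ["Broken"]
        Broken = vars()
        for value in VALUES:  # the second iteration reassigns 'value'
            Broken[value] = value


print(list(State.__members__))  # ['MO', 'KS', 'IL']

try:
    build_without_ignoring_value()
except TypeError as exc:
    print(type(exc).__name__)  # TypeError
```

The exact wording of the TypeError differs between Python versions, so the sketch prints only the exception type.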
From here we can validate our JSON by adding State to a pydantic model.
No more manually updating our State enum.
"""Contents of main.py"""
from enum import Enum
import json

from pydantic.main import BaseModel
from yaml import load, Loader


def _get_valid_values() -> list[str]: ...


class State(str, Enum): ...


class Location(BaseModel):
    state: State


if __name__ == "__main__":
    some_json_string = '{"state": "MO"}'
    data = json.loads(s=some_json_string)
    try:
        print(repr(Location(**data)))  # Location(state=<State.MO: 'MO'>)
    except Exception as exc:
        print(repr(exc))
Bonus
While poking around the web for other ways to dynamically create enums, I found this post on dev.to:
"""Contents of main.py"""
from enum import Enum
import json

from pydantic.main import BaseModel
from yaml import load, Loader


def _get_valid_values() -> list[str]: ...


State = Enum(value="State", names={v: v for v in _get_valid_values()})


class Location(BaseModel):
    state: State


if __name__ == "__main__":
    some_json_string = '{"state": "MO"}'
    data = json.loads(s=some_json_string)
    try:
        print(repr(Location(**data)))  # Location(state=<State.MO: 'MO'>)
    except Exception as exc:
        print(repr(exc))
I like this solution because of its simplicity, but mypy raises a [misc] error saying:
Second argument of Enum() must be string, tuple, list or dict literal for mypy to determine Enum members
Because of this, I will be sticking with my inheritance approach.
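One runtime difference between the two approaches is worth noting: the functional call builds a plain Enum, while the inheritance version mixes in str. If the two should behave alike, the functional API takes a type argument for the mixin. A minimal sketch, assuming a short stand-in list (VALUES) in place of the YAML values. This only affects runtime behavior and does nothing about the mypy error.

```python
from enum import Enum

# Stand-in for the values loaded from the YAML file (an assumption for this sketch).
VALUES = ["MO", "KS", "IL"]

# Plain enum: members are not strings.
PlainState = Enum("PlainState", {v: v for v in VALUES})

# With type=str the members are also str instances, like class State(str, Enum).
StrState = Enum("StrState", {v: v for v in VALUES}, type=str)

print(isinstance(PlainState.MO, str))  # False
print(isinstance(StrState.MO, str))    # True
print(StrState.MO == "MO")             # True
```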