Pydantic - Field function and Model Config
In this post, we'll dive deeper into Pydantic's features and learn how to customize fields using the Field() function. We can use this to set default values, to include/exclude fields from exported model outputs, to set aliases, and to customize the model's JSON schema output.
We'll also learn about model Config classes, which can be used to customize model-wide behavior.
This post follows from the previous post, and we will use truncated versions of the previous Pydantic models in this post, focusing only on specific fields for certain sections.
The source data for this post can be found here.
The associated video for this post is here:
Objectives
In this post, we will learn:
- How to set default values with the Field() function
- How to use aliases to allow model fields to have different names than the fields in the source data
- How to include/exclude fields when exporting models, using both the Field() function and model export functions
- How to add titles and descriptions for fields in JSON Schema outputs, using the Field() function
- How to define model Config classes to set model-wide configuration
Exploring the Field() function
We've seen how to define Pydantic fields using types such as int, float, and date, as well as how to define optional/nullable fields and how to define union fields (where the value may be one of multiple types). We've also seen how to define constrained fields.
Pydantic offers an additional mechanism that can be used to define field information and validations - the Field() function.
In this section, we'll explore some of the things that can be done with this function.
Let's start with an example of how to use the Field() function to define a default value for a field.
We've seen how to do this before - with our Student model, we had a field called modules which was a list of Module objects, as below:
class Student(BaseModel):
    modules: list[Module] = []
The default value here is an empty list.
However, if you assign a call to Field() to the field, that call takes the place of the plain default value (the empty list above), so the default has to be passed to Field() itself. We can use the default keyword argument to the Field() function to do this:
class Student(BaseModel):
    modules: list[Module] = Field(default=[])
This allows us to set the default value. On its own, this offers no benefit over the previous assignment of an empty list, but it is important to know how to set defaults with the Field() function if you're using it for other purposes, as we'll see in the remainder of this post.
One benefit is that we can use a similar keyword argument called default_factory to set a field's default value to a dynamic value. For example, setting a date field to the current date, or setting a UUID field to a dynamically created UUID.
We can define a default factory for our student's date_of_birth field, and set the date of birth to the current date if none is provided in the source data. Note: logically this does not make sense, but let's roll with it and add it below using the default_factory keyword argument:
class Student(BaseModel):
    date_of_birth: date = Field(default_factory=lambda: datetime.today().date())
The factory function we define here is a lambda function that takes no arguments, and returns the current date. You can define your own function to implement whatever logic you'd like for this default factory - it should return a suitable value that will serve as the default for the field!
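To see the difference between a static default and a default factory, here is a minimal sketch. It assumes Pydantic is installed and uses a hypothetical Enrollment model that is not part of this post's dataset.

```python
import uuid
from datetime import date

from pydantic import BaseModel, Field


# Hypothetical model, used only to illustrate defaults.
class Enrollment(BaseModel):
    # Static default: used whenever "status" is missing from the input.
    status: str = Field(default="pending")
    # Dynamic defaults: each factory is called when a new instance is created.
    id: uuid.UUID = Field(default_factory=uuid.uuid4)
    enrolled_on: date = Field(default_factory=date.today)


first = Enrollment()
second = Enrollment(status="active")

print(first.status, second.status)
print(first.id != second.id)  # each instance gets its own UUID
print(first.enrolled_on)
```

Note that date.today and uuid.uuid4 are passed without parentheses: the factory must be a callable that takes no arguments, which Pydantic calls for us.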
Let's move on. So far, we've had Pydantic models whose field names matched the field names in the source data. For example, our class has a date_of_birth field, and that field (with the same name) also exists in the source data here.
But what happens if the name we want to give our field does not match the API/external data? Often, we do not want to use the same name.
Let's look at the student's name. In our class, we've created a name field of type string, as below:
class Student(BaseModel):
    name: str
This matches the key name in the source data here.
However, let's say we want to call this field student_name on our Pydantic model. By default, Pydantic matches on the name of the field, so it expects a key with the same name in the incoming data. However, you can provide an alias, using the Field() function, as below:
class Student(BaseModel):
    student_name: str = Field(alias="name")
So here, our field name is student_name on the model, and we use Field(alias="name") to inform Pydantic that the name of the field in the data source is name.
So this will take the value of name in the data, and store it in the model's student_name field, whilst also performing any validations and data conversions that you define.
This is very handy if you need to map fields in the data you're working with to fields with different names in your Pydantic model.
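As a short sketch of the alias in action, the snippet below builds a trimmed-down Student from a hand-written dictionary (the raw value is made up for illustration, not fetched from the source data). It assumes the Pydantic 1.x API used throughout this series, including the .dict() method and its by_alias argument.

```python
from pydantic import BaseModel, Field


class Student(BaseModel):
    # The incoming data uses the key "name"; the model exposes student_name.
    student_name: str = Field(alias="name")


raw = {"name": "Eric Travis"}  # hand-written sample input
student = Student(**raw)

print(student.student_name)         # Eric Travis
print(student.dict())               # {'student_name': 'Eric Travis'}
print(student.dict(by_alias=True))  # {'name': 'Eric Travis'}
```

Passing by_alias=True when exporting is handy when the output needs to use the same key names as the original data source.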
The Field() function can also be used to define certain validation constraints, such as enforcing a number to be greater than a specific value. We can see a full list of options here in the Pydantic documentation.
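For instance, a numeric range can be expressed with the ge (greater than or equal) and le (less than or equal) keyword arguments. The sketch below is one way to constrain a GPA field with Field(), assuming Pydantic is installed; the values passed in are made up for illustration.

```python
from pydantic import BaseModel, Field, ValidationError


class Student(BaseModel):
    # GPA must satisfy 0 <= GPA <= 4.
    GPA: float = Field(ge=0, le=4)


valid = Student(GPA=3.0)
print(valid.GPA)

try:
    Student(GPA=4.5)  # outside the allowed range
    rejected = False
except ValidationError as exc:
    rejected = True
    print(exc)
```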
Exporting Models - Advanced Usage
We saw in the first post how we can use the .dict() and .json() functions to export a model to a Python dict or a JSON string. By default, this will dump the entire object, i.e. all of its fields and values.
We can control the behavior of this functionality using the Field() function.
There are two additional keyword arguments to Field(), both of which are unset by default:
- include - set to True to indicate that *only* this field should be included when calling model.dict() or model.json()
- exclude - set to True to indicate that the field should be excluded when calling model.dict() or model.json()
Let's say we want to exclude the list of modules from the resulting dictionary or JSON. We can add the exclude=True keyword argument to the field, as below:
class Student(BaseModel):
    modules: list[Module] = Field(default=[], exclude=True)
We can then convert the first model from our retrieved data to a dictionary, as below:
model = Student(**data[0])
print(model.dict())
This gives the following output - note that the module-list has been excluded!
{'GPA': 3.0,
'course': 'Computer Science',
'date_of_birth': datetime.date(1995, 5, 25),
'department': 'Science and Engineering',
'fees_paid': False,
'id': UUID('d15782d9-3d8f-4624-a88b-c8e836569df8'),
'student_name': 'Eric Travis'}
We can add as many exclusions as we want by defining the Field() function on the relevant fields. For example, we might want to exclude the UUID, as it may be for internal use only.
class Student(BaseModel):
    id: uuid.UUID = Field(exclude=True)
    modules: list[Module] = Field(default=[], exclude=True)
The UUID would now also be excluded from the dictionary.
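A field excluded in this way is still validated and stored on the model; it is only left out of the exported output. The sketch below shows this with a hypothetical Account model (not part of this post's dataset), assuming a Pydantic version whose Field() accepts exclude=True, as used above.

```python
from pydantic import BaseModel, Field


# Hypothetical model with a sensitive field.
class Account(BaseModel):
    username: str
    password: str = Field(exclude=True)


account = Account(username="eric", password="s3cret")

print(account.password)  # still available on the model itself
print(account.dict())    # {'username': 'eric'}
```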
Excluding a field this way is useful for security-sensitive fields such as passwords, API keys, etc. However, when flexibly dumping data, you might not want to have to write Field() functions for each field.
Pydantic provides another way to exclude/include fields by passing the same keyword arguments to the .dict() and .json() functions.
We could have achieved the above with the following code:
model = Student(**data[0])
print(model.dict(exclude={'id', 'modules'}))
We pass a set of the keys we want to exclude from the resulting dictionary. This also works with nested objects - for example, if we want to exclude the id, but also exclude the registration_code from all the modules in the list, we could write the following code:
model = Student(**data[0])
exclude = {
    'id': True,
    'modules': {'__all__': {'registration_code'}}
}
print(model.dict(exclude=exclude))
This time, we define a dictionary for the fields we want to exclude. For nested objects, we define a set of fields to exclude, and for nested sequences (such as the modules list above), we also specify the index(es) of the elements from which the nested field should be excluded.
The special value __all__ allows us to exclude the nested field from all elements of the sequence.
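The different forms of include and exclude can be compared on a small example. The sketch below uses trimmed-down models and hand-written sample data (the module names and registration codes are made up), and assumes the Pydantic 1.x .dict() method used in this series.

```python
from pydantic import BaseModel


class Module(BaseModel):
    name: str
    registration_code: str


class Student(BaseModel):
    id: int
    name: str
    modules: list[Module] = []


# Hand-written sample data for illustration only.
student = Student(
    id=1,
    name="Eric Travis",
    modules=[
        {"name": "Algorithms", "registration_code": "abc"},
        {"name": "Databases", "registration_code": "def"},
    ],
)

# include: keep only the listed fields.
only_name = student.dict(include={"name"})

# Index-based exclude: drop registration_code from the first module only.
first_only = student.dict(exclude={"modules": {0: {"registration_code"}}})

# __all__: drop registration_code from every module, and drop the id too.
everywhere = student.dict(
    exclude={"id": True, "modules": {"__all__": {"registration_code"}}}
)

print(only_name)
print(first_only["modules"])
print(everywhere)
```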
These are very useful features in Pydantic. We can control which fields should be excluded and included when converting our models to other data structures, and it's very easy to do.
Model Config Classes
Let's now see what Config classes can do in Pydantic models.
So far, we've seen how to customize individual fields. However, there are settings that can be applied across the entire Pydantic model. These can be defined in a special inner class within the model, called Config.
Let's start with a simple example. Our Student model has a department field, whose type is set to a Python enum.
class DepartmentEnum(Enum):
    ARTS_AND_HUMANITIES = 'Arts and Humanities'
    LIFE_SCIENCES = 'Life Sciences'
    SCIENCE_AND_ENGINEERING = 'Science and Engineering'

class Student(BaseModel):
    department: DepartmentEnum
When we create a model from our data, the resulting value of the field is set to the raw enum. An example is shown below:
# fetch the raw JSON data from Github
url = 'https://raw.githubusercontent.com/bugbytes-io/datasets/master/students_v2.json'
data = requests.get(url).json()
# create model from the first element of the web-data
model = Student(**data[0])
print(model.department)
The output for the model's department field is DepartmentEnum.SCIENCE_AND_ENGINEERING. To get its raw string value, we would need to access the .value attribute of this enum field.
Rather than having to explicitly do this, you can use model Config classes to tell Pydantic that it should always output enum values, rather than the raw enum itself, as in the Config class at the end of the Student model below:
class DepartmentEnum(Enum):
    ARTS_AND_HUMANITIES = 'Arts and Humanities'
    LIFE_SCIENCES = 'Life Sciences'
    SCIENCE_AND_ENGINEERING = 'Science and Engineering'

# Pydantic model to outline structure/types of Students (including nested model)
class Student(BaseModel):
    department: DepartmentEnum

    class Config:
        use_enum_values = True
The use_enum_values field of the Config class performs this function. Now, if we run the same code as before, the output of model.department will be the raw string for the enum field: "Science and Engineering".
With this configuration, all enum fields in the class will output the raw value when accessing the field or dumping it to a dictionary with the model.dict() method.
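The effect of use_enum_values is easiest to see side by side. The sketch below defines two trimmed-down models (the names RawStudent and ValueStudent are made up for this comparison) and feeds both the same hand-written input, assuming Pydantic is installed.

```python
from enum import Enum

from pydantic import BaseModel


class DepartmentEnum(Enum):
    ARTS_AND_HUMANITIES = 'Arts and Humanities'
    LIFE_SCIENCES = 'Life Sciences'
    SCIENCE_AND_ENGINEERING = 'Science and Engineering'


# Without the setting: the field holds the enum member.
class RawStudent(BaseModel):
    department: DepartmentEnum


# With the setting: the field holds the member's underlying value.
class ValueStudent(BaseModel):
    department: DepartmentEnum

    class Config:
        use_enum_values = True


data = {'department': 'Science and Engineering'}  # hand-written sample input

raw_student = RawStudent(**data)
value_student = ValueStudent(**data)

print(raw_student.department)        # DepartmentEnum.SCIENCE_AND_ENGINEERING
print(raw_student.department.value)  # Science and Engineering
print(value_student.department)      # Science and Engineering
```

In both cases the input is still validated against the enum, so a department that is not one of the three values is rejected.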
Let's move on. Another useful field in the Config class is the extra field, which tells Pydantic how to behave when instantiating a model with extra fields that are not defined on the class.
The extra field can take on three values:
- ignore - do nothing when encountering extra attributes
- allow - assign the extra attributes to the model
- forbid - cause validation to fail with a ValidationError if extra attributes are passed to the model
To demonstrate this, let's remove a field from our Pydantic model. Let's say we remove the fees_paid boolean field (see the previous post), and have the following models in our application:
# define an Enum of acceptable Department values
class DepartmentEnum(Enum):
    ARTS_AND_HUMANITIES = 'Arts and Humanities'
    LIFE_SCIENCES = 'Life Sciences'
    SCIENCE_AND_ENGINEERING = 'Science and Engineering'

# Pydantic model to outline structure/types of Modules
class Module(BaseModel):
    id: Union[uuid.UUID, int]
    name: str
    professor: str
    credits: Literal[10, 20]
    registration_code: str

# Pydantic model to outline structure/types of Students (including nested model)
class Student(BaseModel):
    id: uuid.UUID
    student_name: str = Field(alias="name")
    date_of_birth: date = Field(default_factory=lambda: datetime.today().date())
    GPA: confloat(ge=0, le=4)
    course: Optional[str]
    department: DepartmentEnum
    modules: list[Module] = Field(default=[])

    class Config:
        use_enum_values = True
Now, our model is missing a field that's defined in our incoming data here. So the question is: what should the model do when we instantiate it and pass this field that it does not know about?
Let's start by adding the extra key to our Config class, and setting it to ignore (which is the default):
class Student(BaseModel):
    id: uuid.UUID
    student_name: str = Field(alias="name")
    date_of_birth: date = Field(default_factory=lambda: datetime.today().date())
    GPA: confloat(ge=0, le=4)
    course: Optional[str]
    department: DepartmentEnum
    modules: list[Module] = Field(default=[])

    class Config:
        use_enum_values = True
        extra = 'ignore'
Now, any extra attributes that are provided to the Pydantic model are silently ignored.
Often, we might want to explicitly forbid extra data being set on our model and potentially serialized later in the workflow. Imagine a rogue password or API key, for example. We can use extra='forbid' to achieve this.
On the other hand, we might want to be flexible with our data model, and may need to accept other attributes that are not explicitly defined in the Pydantic model. We can use extra='allow' for this.
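The three settings can be compared by passing the same input to three otherwise identical models. This is a sketch with trimmed-down, hypothetical models (IgnoringStudent, AllowingStudent and StrictStudent are made-up names) and a hand-written input, assuming the Config inner class and .dict() method used in this series.

```python
from pydantic import BaseModel, ValidationError

# Hand-written input containing a key (fees_paid) the models do not define.
raw = {"name": "Eric Travis", "fees_paid": False}


class IgnoringStudent(BaseModel):
    name: str

    class Config:
        extra = "ignore"  # the default: unknown keys are dropped


class AllowingStudent(BaseModel):
    name: str

    class Config:
        extra = "allow"  # unknown keys are kept on the model


class StrictStudent(BaseModel):
    name: str

    class Config:
        extra = "forbid"  # unknown keys cause a ValidationError


ignored = IgnoringStudent(**raw).dict()
allowed = AllowingStudent(**raw).dict()

try:
    StrictStudent(**raw)
    forbidden_error = None
except ValidationError as exc:
    forbidden_error = exc

print(ignored)  # {'name': 'Eric Travis'}
print(allowed)  # {'name': 'Eric Travis', 'fees_paid': False}
print(forbidden_error)
```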
This attribute is useful, but there are many others that you can define within a Config class - for example, the anystr_strip_whitespace attribute that will handle stripping rogue whitespace from incoming string/byte data.
For a list of all available Config class fields, see the Pydantic documentation here.
Summary
In this post, we've covered some more useful Pydantic concepts. We've seen how to use the Field() function to set defaults, including using the default_factory keyword argument to set a dynamic default value. We've also seen how to use the alias keyword argument to handle the case where fields on your model have different names than the incoming source data.
We looked at more advanced uses of the model.dict() and model.json() functions, used to export Pydantic models to dictionaries and JSON strings, respectively. We saw how to exclude certain fields from the output, and saw that there's also a mechanism for specifying which fields to include (only).
Finally, we learned about the model Config inner class, which can be used to set model-wide configuration. This can be used to perform cleanup of data - for example, stripping whitespace from all string/byte data - and also to transform values in the class, for example by setting enum types to their raw values. We also learned about the important extra attribute, which controls what Pydantic models do when they encounter attributes that are not explicitly defined in the model class.
In the next post, we'll dive deeper into model validation techniques.