|
| 1 | +(pydantic_basemodel)= |
| 2 | + |
| 3 | +# Pydantic BaseModel |
| 4 | + |
| 5 | +```{eval-rst} |
| 6 | +.. tags:: Basic |
| 7 | +``` |
| 8 | + |
| 9 | +`flytekit` version 1.14 and later natively supports the `JSON` format that Pydantic `BaseModel` produces, improving the |
| 10 | +interoperability of Pydantic BaseModels with the Flyte type system. |
| 11 | + |
| 12 | +:::{important} |
| 13 | +Pydantic BaseModel V2 is supported only with `flytekit` version >= v1.14.0. |
| 14 | +::: |
| 15 | + |
| 16 | +With the 1.14 release, `flytekit` adopted `MessagePack` as the serialization format for Pydantic `BaseModel`. |
| 17 | +This overcomes a major limitation of previous versions, which serialized into a JSON string within a Protobuf `struct` datatype: |
| 18 | + |
| 19 | +Protobuf's `struct` stores `int` values as `float`, forcing users to write boilerplate code to work around the issue. |
| 20 | + |
| 21 | +:::{important} |
| 22 | +By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing, preserving the types defined in your `BaseModel` class. |
| 23 | +If you're serializing `BaseModel` using `flytekit` version >= v1.14.0 and you want to produce a Protobuf `struct` literal instead, you can set the environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. |
| 24 | + |
| 25 | +For more details, refer to the MessagePack IDL RFC: https://github.com/flyteorg/flyte/blob/master/rfc/system/5741-binary-idl-with-message-pack.md |
| 26 | +::: |
| 27 | + |
| 28 | +```{note} |
| 29 | +You can use dataclasses and Flyte types (FlyteFile, FlyteDirectory, FlyteSchema, and StructuredDataset) as fields in a Pydantic BaseModel. |
| 30 | +``` |
| 31 | + |
| 32 | +```{note} |
| 33 | +To clone and run the example code on this page, see the [Flytesnacks repo][flytesnacks]. |
| 34 | +``` |
| 35 | + |
| 36 | +To begin, import the necessary dependencies: |
| 37 | + |
| 38 | +```{literalinclude} /examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py |
| 39 | +:caption: data_types_and_io/pydantic_basemodel.py |
| 40 | +:lines: 1-9 |
| 41 | +``` |
| 42 | + |
| 43 | +Build your custom image with ImageSpec: |
| 44 | +```{literalinclude} /examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py |
| 45 | +:caption: data_types_and_io/pydantic_basemodel.py |
| 46 | +:lines: 11-14 |
| 47 | +``` |
| 48 | + |
| 49 | +## Python types |
| 50 | +We define a Pydantic `BaseModel` with `int`, `str` and `dict` as the data types. |
| 51 | + |
| 52 | +```{literalinclude} /examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py |
| 53 | +:caption: data_types_and_io/pydantic_basemodel.py |
| 54 | +:pyobject: Datum |
| 55 | +``` |
| 56 | + |
| 57 | +You can send a Pydantic `BaseModel` between different tasks written in various languages, and input it through the Flyte console as raw JSON. |
| 58 | + |
| 59 | +:::{note} |
| 60 | +All fields in a `BaseModel` should be **annotated with their type**. Failure to do so will result in an error. |
| 61 | +::: |
| 62 | + |
| 63 | +Once declared, a `BaseModel` can be returned as an output or accepted as an input. |
| 64 | + |
| 65 | +```{literalinclude} /examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py |
| 66 | +:caption: data_types_and_io/pydantic_basemodel.py |
| 67 | +:lines: 26-41 |
| 68 | +``` |
| 69 | + |
| 70 | +## Flyte types |
| 71 | +We also define a `BaseModel` that accepts {std:ref}`StructuredDataset <structured_dataset>`, |
| 72 | +{std:ref}`FlyteFile <files>` and {std:ref}`FlyteDirectory <folder>`. |
| 73 | + |
| 74 | +```{literalinclude} /examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py |
| 75 | +:caption: data_types_and_io/pydantic_basemodel.py |
| 76 | +:lines: 45-86 |
| 77 | +``` |
| 78 | + |
| 79 | +A `BaseModel` can hold data associated with Python types, dataclasses, |
| 80 | +`FlyteFile`, `FlyteDirectory` and `StructuredDataset`. |
| 81 | + |
| 82 | +We define a workflow that calls the tasks created above. |
| 83 | + |
| 84 | +```{literalinclude} /examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py |
| 85 | +:caption: data_types_and_io/pydantic_basemodel.py |
| 86 | +:pyobject: basemodel_wf |
| 87 | +``` |
| 88 | + |
| 89 | +You can run the workflow locally as follows: |
| 90 | + |
| 91 | +```{literalinclude} /examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py |
| 92 | +:caption: data_types_and_io/pydantic_basemodel.py |
| 93 | +:lines: 99-100 |
| 94 | +``` |
| 95 | + |
| 96 | +To trigger the workflow with `pyflyte run`, you can provide its inputs on the command line: |
| 97 | +``` |
| 98 | +pyflyte run \ |
| 99 | + https://raw.githubusercontent.com/flyteorg/flytesnacks/b71e01d45037cea883883f33d8d93f258b9a5023/examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py \ |
| 100 | + basemodel_wf --x 1 --y 2 |
| 101 | +``` |
| 102 | + |
| 103 | +[flytesnacks]: https://github.com/flyteorg/flytesnacks/tree/master/examples/data_types_and_io/ |
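
The `int`-to-`float` limitation described in the page above can be illustrated with a small standard-library sketch. It does not use `flytekit` or Protobuf; it only mimics the fact that a Protobuf `Struct` stores every number as a double. The helper `struct_like_roundtrip` and the sample payload are hypothetical names chosen for illustration, not part of any Flyte API.

```python
import json


def struct_like_roundtrip(value):
    """Mimic a Protobuf Struct, where every numeric value is stored as a double.

    Hypothetical helper for illustration only; not flytekit code.
    """
    # bool is a subclass of int, so handle it first to leave booleans untouched.
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, dict):
        return {k: struct_like_roundtrip(v) for k, v in value.items()}
    if isinstance(value, list):
        return [struct_like_roundtrip(v) for v in value]
    return value


# Sample payload with int, str and dict fields (hypothetical data).
original = {"x": 1, "y": "hello", "z": {"count": 3}}

# Serialize to a JSON string, then read it back through the Struct-like layer.
restored = struct_like_roundtrip(json.loads(json.dumps(original)))

# The integer field comes back as a float.
print(type(original["x"]).__name__, "->", type(restored["x"]).__name__)

# The kind of boilerplate users had to write: cast known int fields back by hand.
fixed_x = int(restored["x"])
```

A format that keeps integers and floats distinct, as MessagePack does, makes the manual cast at the end unnecessary.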