chore: adds documentation for the new ref_cache implementation

2025-11-25 23:50:58 -03:00
parent 20872d4a91
commit 268ac85667
3 changed files with 362 additions and 55 deletions
--- a/docs/source/usage.ref_cache.rst
+++ b/docs/source/usage.ref_cache.rst
@@ -0,0 +1,233 @@
+===============
+Reference Cache
+===============
+
+The reference cache is named after the mechanism used to implement
+the ``$ref`` keyword in the JSON Schema specification.
+
+Both :py:meth:`SchemaConverter.build_with_cache <jambo.SchemaConverter.build_with_cache>`
+and :py:meth:`SchemaConverter.build <jambo.SchemaConverter.build>` use the cache internally.
+Only :py:meth:`SchemaConverter.build_with_cache <jambo.SchemaConverter.build_with_cache>` exposes it through a supported API;
+:py:meth:`SchemaConverter.build <jambo.SchemaConverter.build>` provides no access to it.
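+
+To make the idea concrete, here is a toy sketch of why a cache keyed by reference yields the same class on repeated lookups. It is illustrative only and is not Jambo's implementation; ``resolve_ref`` and ``definitions`` are hypothetical names introduced for this example.
+
+.. code-block:: python
+
+    # Illustrative only: a toy resolver, not Jambo's implementation.
+    def resolve_ref(ref, definitions, ref_cache):
+        """Return the class for ``ref``, creating it at most once per cache."""
+        if ref not in ref_cache:
+            name = ref.rsplit("/", 1)[-1]
+            fields = definitions[name]["properties"]
+            ref_cache[ref] = type(name, (), {"fields": tuple(fields)})
+        return ref_cache[ref]
+
+    definitions = {"address": {"properties": {"street": {}, "city": {}}}}
+    cache = {}
+
+    first = resolve_ref("#/$defs/address", definitions, cache)
+    second = resolve_ref("#/$defs/address", definitions, cache)
+
+    # same cache, so the second lookup returns the class created by the first
+    assert first is second
+
+    # a fresh cache knows nothing about earlier calls, so a new class is created
+    assert resolve_ref("#/$defs/address", definitions, {}) is not first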
+
+
+-----------------------------------------
+Configuring and Using the Reference Cache
+-----------------------------------------
+
+The reference cache can be used in three ways:
+
+* Without a persistent reference cache (no sharing between calls).
+* Passing an explicit ``ref_cache`` dictionary to a call.
+* Using the converter instance's default cache (the instance-level cache).
+
+
+Usage Without Reference Cache
+=============================
+
+When you run the library without a persistent reference cache, the generated
+types are not stored for reuse. Each call to a build method creates fresh
+Pydantic model classes (they will have different Python object identities).
+Because nothing is cached, you cannot look up generated subtypes later.
+
+This is the default behaviour of :py:meth:`SchemaConverter.build <jambo.SchemaConverter.build>`.
+You can achieve the same behaviour with :py:meth:`SchemaConverter.build_with_cache <jambo.SchemaConverter.build_with_cache>` by
+passing ``without_cache=True``.
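+
+As a minimal sketch of this mode, the snippet below calls the static ``build`` twice with the same schema. The schema is illustrative, and the final assertion restates the behaviour described above rather than independently verified output.
+
+.. code-block:: python
+
+    from jambo import SchemaConverter
+
+    # illustrative schema
+    schema = {
+        "title": "person",
+        "type": "object",
+        "properties": {"name": {"type": "string"}},
+    }
+
+    # each call builds with its own temporary cache
+    model_a = SchemaConverter.build(schema)
+    model_b = SchemaConverter.build(schema)
+
+    # nothing was shared between the calls, so the classes are distinct objects
+    assert model_a is not model_b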
+
+
+Usage: Manually Passing a Reference Cache
+=========================================
+
+You can create and pass your own mutable mapping (typically a plain dict)
+as the reference cache. This gives you full control over sharing and
+lifetime of cached types. When two converters share the same dict, types
+created by one converter will be reused by the other.
+
+.. code-block:: python
+
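+    from jambo import SchemaConverter
+
+    # a small schema used for illustration
+    schema = {
+        "title": "person",
+        "type": "object",
+        "properties": {"name": {"type": "string"}},
+    }
+
+    # a shared cache you control
+    shared_cache = {}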
+
+    converter1 = SchemaConverter(shared_cache)
+    converter2 = SchemaConverter(shared_cache)
+
+    model1 = converter1.build_with_cache(schema)
+    model2 = converter2.build_with_cache(schema)
+
+    # Because both converters use the same cache object, the built models are the same object
+    assert model1 is model2
+
+If you prefer a per-call cache (leaving the converter's instance cache unchanged), pass the ``ref_cache`` parameter to
+:py:meth:`SchemaConverter.build_with_cache <jambo.SchemaConverter.build_with_cache>`:
+
+.. code-block:: python
+
+    # pass an explicit, private cache for this call only
+    model_a = converter1.build_with_cache(schema, ref_cache={})
+    model_b = converter1.build_with_cache(schema, ref_cache={})
+
+    # because each call received a fresh dict, the resulting model classes are distinct
+    assert model_a is not model_b
+
+
+Usage: Using the Instance Default (Instance-level) Cache
+========================================================
+
+By default, a :py:class:`SchemaConverter <jambo.SchemaConverter>` instance creates and keeps an internal
+reference cache (a plain dict). Reusing the same converter instance across
+multiple calls will reuse that cache and therefore reuse previously generated
+model classes.
+
+.. code-block:: python
+
+    converter = SchemaConverter()  # has its own internal cache
+
+    model1 = converter.build_with_cache(schema)
+    model2 = converter.build_with_cache(schema)
+
+    # model1 and model2 are the same object because the instance cache persisted
+    assert model1 is model2
+
+If you want to temporarily avoid using the instance cache for a single call,
+use ``without_cache=True``. That causes :py:meth:`SchemaConverter.build_with_cache <jambo.SchemaConverter.build_with_cache>` to
+use a fresh, empty cache for the duration of that call only:
+
+.. code-block:: python
+
+    model1 = converter.build_with_cache(schema, without_cache=True)
+    model2 = converter.build_with_cache(schema, without_cache=True)
+
+    # each call used a fresh cache, so the models are distinct
+    assert model1 is not model2
+
+
+Inspecting and Managing the Cache
+=================================
+
+The converter provides a small, explicit API to inspect and manage the
+instance cache.
+
+Retrieving cached types
+-----------------------
+
+:py:meth:`SchemaConverter.get_cached_ref <jambo.SchemaConverter.get_cached_ref>` takes a ``name`` and returns the cached model class, or ``None`` if nothing is cached under that name.
+
+Retrieving the root type of the schema
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When retrieving the root type of a schema, pass the schema's ``title`` property as the name.
+
+.. code-block:: python
+
+    from jambo import SchemaConverter
+
+    converter = SchemaConverter()
+
+    schema = {
+        "title": "person",
+        "type": "object",
+        "properties": {
+            "name": {"type": "string"},
+            "age": {"type": "integer"},
+        },
+    }
+
+    person_model = converter.build_with_cache(schema)
+    cached_person_model = converter.get_cached_ref("person")
+
+
+Retrieving a subtype
+~~~~~~~~~~~~~~~~~~~~
+
+When retrieving a subtype, pass a path string (for example, ``parent_name.field_name``) as the name.
+
+.. code-block:: python
+
+    from jambo import SchemaConverter
+
+    converter = SchemaConverter()
+
+    schema = {
+        "title": "person",
+        "type": "object",
+        "properties": {
+            "name": {"type": "string"},
+            "age": {"type": "integer"},
+            "address": {
+                "type": "object",
+                "properties": {
+                    "street": {"type": "string"},
+                    "city": {"type": "string"},
+                },
+                "required": ["street", "city"],
+            },
+        }
+    }
+
+    person_model = converter.build_with_cache(schema)
+    cached_address_model = converter.get_cached_ref("person.address")
+
+
+
+Retrieving a type from ``$defs``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When retrieving a type defined in ``$defs``, access it directly by its name.
+
+.. code-block:: python
+
+    from jambo import SchemaConverter
+
+    converter = SchemaConverter()
+
+    schema = {
+        "title": "person",
+        "type": "object",
+        "properties": {
+            "name": {"type": "string"},
+            "age": {"type": "integer"},
+            "address": {"$ref": "#/$defs/address"},
+        },
+        "$defs": {
+            "address": {
+                "type": "object",
+                "properties": {
+                    "street": {"type": "string"},
+                    "city": {"type": "string"},
+                },
+                "required": ["street", "city"],
+            }
+        },
+    }
+
+    person_model = converter.build_with_cache(schema)
+    cached_address_model = converter.get_cached_ref("address")
+
+
+Clearing the cache
+------------------
+
+:py:meth:`SchemaConverter.clear_ref_cache <jambo.SchemaConverter.clear_ref_cache>` removes all entries from the instance cache.
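+
+A short sketch of how clearing interacts with lookups, using only the calls described in this section. The schema is illustrative, and the ``None`` result after clearing follows from the descriptions above.
+
+.. code-block:: python
+
+    from jambo import SchemaConverter
+
+    converter = SchemaConverter()
+
+    # illustrative schema
+    schema = {
+        "title": "person",
+        "type": "object",
+        "properties": {"name": {"type": "string"}},
+    }
+
+    person_model = converter.build_with_cache(schema)
+
+    # the root type is cached under the schema's title
+    assert converter.get_cached_ref("person") is person_model
+
+    # after clearing, the instance cache no longer holds the entry
+    converter.clear_ref_cache()
+    assert converter.get_cached_ref("person") is None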
+
+
+Notes and Behavioural Differences
+=================================
+
+* :py:meth:`SchemaConverter.build <jambo.SchemaConverter.build>` does not expose or persist an instance cache. If you call it without
+  providing a ``ref_cache`` it will create and use a temporary cache for that
+  call only; nothing from that call will be available later via
+  :py:meth:`SchemaConverter.get_cached_ref <jambo.SchemaConverter.get_cached_ref>`.
+
+* :py:meth:`SchemaConverter.build_with_cache <jambo.SchemaConverter.build_with_cache>` is the supported entry point when you want
+  cache control: it uses the instance cache by default, accepts an explicit
+  ``ref_cache`` dict for per-call control, or uses ``without_cache=True`` to
+  run with an ephemeral cache.
+
+
+References in the Test Suite
+============================
+
+These behaviours are exercised in the project's tests; see :mod:`tests.test_schema_converter`
+for examples and additional usage notes.
--- a/docs/source/usage.rst
+++ b/docs/source/usage.rst
@@ -1,9 +1,15 @@
+===================
 Using Jambo
 ===================

-Jambo is designed to be easy to use, it doesn't require any complex setup or configuration.
-Below a example of how to use Jambo to convert a JSON Schema into a Pydantic model.
+Jambo is designed to be easy to use. The simplest path needs no setup or configuration, and instance methods are available when you need more control.

+Below is an example of how to use Jambo to convert a JSON Schema into a Pydantic model.
+
+
+-------------------------
+Static Method (no config)
+-------------------------

 .. code-block:: python

@@ -15,8 +21,16 @@ Below a example of how to use Jambo to convert a JSON Schema into a Pydantic mod
        "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
+            "address": {
+                "type": "object",
+                "properties": {
+                    "street": {"type": "string"},
+                    "city": {"type": "string"},
+                },
+                "required": ["street", "city"],
+            },
        },
-        "required": ["name"],
+        "required": ["name", "address"],
    }

    Person = SchemaConverter.build(schema)
@@ -26,16 +40,76 @@ Below a example of how to use Jambo to convert a JSON Schema into a Pydantic mod
    # Output: Person(name='Alice', age=30)


-The :py:meth:`SchemaConverter.build <jambo.SchemaConverter.build>` static method takes a JSON Schema dictionary and returns a Pydantic model class. You can then instantiate this class with the required fields, and it will automatically validate the data according to the schema.
+The :py:meth:`SchemaConverter.build <jambo.SchemaConverter.build>` static method takes a JSON Schema dictionary and returns a Pydantic model class.

-If passed a description inside the schema it will also add it to the Pydantic model using the `description` field. This is useful for AI Frameworks as: LangChain, CrewAI and others, as they use this description for passing context to LLMs.
+Note: the static ``build`` method was the original public API of this library and is kept for backwards compatibility. It creates and returns a model class for the provided schema but does not expose or persist an instance cache.


-For more complex schemas and types see our documentation on
+--------------------------------
+Instance Method (with ref cache)
+--------------------------------
+
+.. code-block:: python
+
+    from jambo import SchemaConverter
+
+    converter = SchemaConverter()
+
+    schema = {
+        "title": "Person",
+        "type": "object",
+        "properties": {
+            "name": {"type": "string"},
+            "age": {"type": "integer"},
+            "address": {
+                "type": "object",
+                "properties": {
+                    "street": {"type": "string"},
+                    "city": {"type": "string"},
+                },
+                "required": ["street", "city"],
+            },
+        },
+        "required": ["name", "address"],
+    }
+
+    # The instance API (build_with_cache) populates the converter's instance-level reference cache
+    Person = converter.build_with_cache(schema)
+
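+    # "address" is required by the schema, so it must be supplied
+    obj = Person(name="Alice", age=30, address={"street": "Main St", "city": "Springfield"})
+    print(obj.name, obj.address.city)
+    # Output: Alice Springfield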
+
+    # When using the converter's built-in instance cache (no ref_cache passed to the call),
+    # all object types parsed during the build are stored and can be retrieved via get_cached_ref.
+
+    cached_person_model = converter.get_cached_ref("Person")
+    assert Person is cached_person_model  # the cached class is the same object that was built
+
+    # A nested/subobject type can also be retrieved from the instance cache
+    cached_address_model = converter.get_cached_ref("Person.address")
+
+
+The :py:meth:`SchemaConverter.build_with_cache <jambo.SchemaConverter.build_with_cache>` instance method was added after the
+initial static API to make it easier to access and reuse subtypes defined in a schema.
+Unlike the original static :py:meth:`SchemaConverter.build <jambo.SchemaConverter.build>`,
+the instance method persists and exposes the reference cache and provides helpers such as
+:py:meth:`SchemaConverter.get_cached_ref <jambo.SchemaConverter.get_cached_ref>` and
+:py:meth:`SchemaConverter.clear_ref_cache <jambo.SchemaConverter.clear_ref_cache>`.
+
+For details and examples about the reference cache and the different cache modes (instance cache, per-call cache, ephemeral cache), see:
+
+.. toctree::
+    usage.ref_cache
+
+
+-----------
+Type System
+-----------
+
+For a full explanation of the supported schemas and types see our documentation on types:

 .. toctree::
    :maxdepth: 2
-    :caption: Contents:

    usage.string
    usage.numeric