i2mint/py2json


py2json

A small toolkit to help serialize Python objects and callable references into JSON-friendly representations and reconstruct them. It provides:

  • Ctor — deconstruct/construct objects via a CONSTRUCTOR/ARGS/KWARGS dict representation.
  • fakit — a lightweight mini-language to express function calls (f, a, k) and execute them.
  • dotpath helpers — obj_to_dotpath and dotpath_to_obj for resolving dotted references.
  • helpers to extract function metadata (obj2dict.func_info_dict) and a JSON encoder that understands numpy and bytes.
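As a rough illustration of what a JSON encoder that "understands numpy and bytes" might look like, here is a minimal standalone sketch. It does not use py2json's own encoder; the class name `BytesFriendlyEncoder` and the `__bytes__` marker key are hypothetical choices made for this example only.

```python
import base64
import json


class BytesFriendlyEncoder(json.JSONEncoder):
    """Hypothetical encoder: NOT py2json's encoder, just an illustration."""

    def default(self, o):
        # bytes are not JSON-serializable, so store them as base64 text
        if isinstance(o, (bytes, bytearray)):
            return {'__bytes__': base64.b64encode(bytes(o)).decode('ascii')}
        # numpy arrays and numpy scalars both expose a tolist() method
        if hasattr(o, 'tolist'):
            return o.tolist()
        # anything else: fall back to the standard TypeError
        return super().default(o)


encoded = json.dumps({'payload': b'\x00\x01'}, cls=BytesFriendlyEncoder)
```

The duck-typed `tolist` check keeps the sketch free of a hard numpy dependency; a real encoder may well check numpy types explicitly instead.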

Quick examples

py2json provides a small JsonCodec and make_json_codec factory that wire Ctor and fakit into a compact encode/decode interface. Instead of forcing you to pick from fixed protocol names, the factory accepts a path_parser argument which is either a callable Callable[[str], str] or a string used as the separator between module and object parts (default is '.').

This lets you support styles such as:

  • dotted: package.module.attr (default, path_parser='.')
  • colon: package.module:Class.attr (path_parser=':')

The codec normalizes path strings via the path_parser and evaluates $fak expressions via refakit (with an injectable func_loader for whitelisting).

Example (colon separator):

from py2json import make_json_codec

codec = make_json_codec(path_parser=':')
encoded = codec.encode('collections.namedtuple:MyTuple')
decoded = codec.decode(encoded)
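To make the "string separator or callable" idea concrete, here is a small standalone sketch of how a path_parser argument could be turned into a normalizing function. This is an assumption about what the normalization amounts to, not py2json's actual code; `as_path_parser` is a hypothetical helper name.

```python
from typing import Callable, Union

PathParser = Callable[[str], str]


def as_path_parser(path_parser: Union[str, PathParser] = '.') -> PathParser:
    """Hypothetical helper: return a function that normalizes a path to dotted form."""
    if callable(path_parser):
        # The caller supplied their own normalizer: use it unchanged
        return path_parser

    sep = path_parser

    def parse(path: str) -> str:
        # Replace only the first separator, the one between module and object
        return path.replace(sep, '.', 1)

    return parse


colon_parser = as_path_parser(':')
dotted = colon_parser('package.module:Class.attr')  # 'package.module.Class.attr'
```

With the default separator `'.'` the returned function leaves its input unchanged, which matches the dotted style being the default.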

Tools for json serialization of python objects

A peep a bit deeper

The JsonCodec instance that make_json_codec returns uses Ctor and fakit. Let's have a quick peep at those.

Create a namedtuple using Ctor and instantiate it:

from py2json.ctor import Ctor
from collections import namedtuple

ctor_jdict = Ctor.to_ctor_dict(namedtuple, args=('A', 'x y z'))
A = Ctor.construct(ctor_jdict)
inst = A('one', 'two', 'three')

Use fakit to express and run a call given a dotted path:

from py2json.fakit import fakit
fakit({'f': 'os.path.join', 'a': ['I', 'am', 'a', 'filepath']})

Resolve dotted references and round-trip:

from py2json.fakit import obj_to_dotpath, dotpath_to_obj
from inspect import Signature
dot = obj_to_dotpath(Signature.replace)
assert dotpath_to_obj(dot) is Signature.replace
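For readers curious how such a round trip can work, the following standalone sketch resolves a dotted path by importing the longest importable module prefix and then walking attributes. It is only an illustration under that assumption; `to_dotpath` and `from_dotpath` are hypothetical names and are not the py2json functions.

```python
from importlib import import_module
from inspect import Signature


def to_dotpath(obj):
    """Build 'module.qualified.name' from an object's own metadata."""
    return f'{obj.__module__}.{obj.__qualname__}'


def from_dotpath(dotpath):
    """Import the longest importable prefix, then follow the remaining attributes."""
    parts = dotpath.split('.')
    for i in range(len(parts), 0, -1):
        try:
            obj = import_module('.'.join(parts[:i]))
        except ImportError:
            continue  # prefix is not a module: try a shorter one
        for name in parts[i:]:
            obj = getattr(obj, name)
        return obj
    raise ImportError(f'Could not resolve {dotpath!r}')


path = to_dotpath(Signature.replace)
```

Objects without a usable `__module__` or `__qualname__` (lambdas, locally defined functions) would need special handling that this sketch leaves out.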

Notes

  • Ctor will serialize callables to a JSON-friendly jdict with keys {module, name, attr} and can construct them back into callables or instantiated objects.
  • fakit accepts either a callable, a dotted string, or a small structure (f, a, k) and uses a configurable func_loader to resolve f. For security, supply a whitelist func_loader.
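The whitelist idea in the last note can be sketched without py2json at all. The code below is a simplified stand-in: `whitelist_func_loader`, `run_call` and the `ALLOWED` set are hypothetical names for this example, and the exact way fakit accepts a func_loader should be checked against the library itself.

```python
from importlib import import_module

# Only these dotted paths may be resolved to callables (illustrative choice)
ALLOWED = {'os.path.join', 'math.sqrt'}


def whitelist_func_loader(dotpath):
    """Resolve a dotted path to a callable, but only if it is whitelisted."""
    if dotpath not in ALLOWED:
        raise ValueError(f'Function not allowed: {dotpath}')
    module_path, _, name = dotpath.rpartition('.')
    return getattr(import_module(module_path), name)


def run_call(spec, func_loader=whitelist_func_loader):
    """Execute a {'f': ..., 'a': [...], 'k': {...}} call description."""
    f = spec['f']
    func = f if callable(f) else func_loader(f)
    return func(*spec.get('a', ()), **spec.get('k', {}))


result = run_call({'f': 'os.path.join', 'a': ['I', 'am', 'a', 'filepath']})
```

Because the loader is the only place where strings become callables, a request for something like `os.system` fails before anything is executed.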

See misc/py2json_wip.ipynb for runnable demos.

Why py2json?

Here we tackle the problem of serializing a Python object into JSON.

JSON is a convenient choice for web request responses or for working with MongoDB, for instance.

It is usually understood that we serialize an object in order to deserialize it later and recover the original object. Implicit in this is some definition of equality, which is not as trivial as it may seem. Usually some aspects of the deserialized object will be different, so we need to be clear on what should be the same.

For example, we probably don't care if the address of the deserialized object is different. But we probably care that its key attributes are the same.

What should guide us in deciding what aspects of an object should be recovered?

Behavior.

The only value of an object is the behavior that will ensue. This may be the behavior of all or some of the methods of a serialized instance, or the behavior of some other functions that will depend on the deserialized object.

Our approach to converting a Python object to JSON will touch on some i2i cornerstones that are more general: conversion and contextualization.

Behavior equivalence: What do we need an object to have?

Say we are given the code below.

def func(obj):
    return obj.a + obj.b

class A:
    e = 2
    def __init__(self, a=0, b=0, c=1, d=10):
        self.a = a
        self.b = b
        self.c = c
        self.d = d
        
    def target_func(self, x=3):
        t = func(self)
        tt = self.other_method(t)
        return x * tt / self.e
    
    def other_method(self, x=1):
        return self.c * x

Which we use to make the following object

obj = A(a=2, b=3)

Say we want to json-serialize this so that a deserialized object dobj is such that for all valid obj, resulting dobj, and valid x input:

obj.target_func(x) == A.target_func(obj, x) == A.target_func(dobj, x)

The first equality is just a reminder of a python equivalence. The second equality is really what we're after.

When this is true, we'll say that obj and dobj are equivalent on A.target_func -- or just "equivalent" when the function(s) they should be equivalent on are clear from context.

To satisfy this equality we need dobj to:

  • Contain all the attributes it needs to be able to compute the A.target_func function -- which means all the expressions contained in that function or, recursively, any functions it calls.
  • Be such that the values of the same attribute in obj and dobj are equivalent (over the functions in the call tree of the target function that involve these attributes).

Let's have a manual look at it. First, you need to compute func(self), which will require the attributes a and b. Secondly, you'll need to compute other_method, which uses attribute c. Finally, the last expression, x * tt / self.e, uses the attribute e.

So we need to make sure we serialize the attributes: {'a', 'b', 'c', 'e'}.
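To check this conclusion by hand, here is a minimal sketch that serializes only those four attributes, rebuilds an object from the JSON, and compares behavior. The `serialize` and `deserialize` helpers are hypothetical names written for this example; they are not py2json functions.

```python
import json


def func(obj):
    return obj.a + obj.b


class A:
    e = 2

    def __init__(self, a=0, b=0, c=1, d=10):
        self.a, self.b, self.c, self.d = a, b, c, d

    def target_func(self, x=3):
        t = func(self)
        tt = self.other_method(t)
        return x * tt / self.e

    def other_method(self, x=1):
        return self.c * x


NEEDED = ('a', 'b', 'c', 'e')  # the attributes identified above


def serialize(obj):
    # Keep only what A.target_func needs; d is deliberately dropped
    return json.dumps({name: getattr(obj, name) for name in NEEDED})


def deserialize(json_str):
    dobj = A.__new__(A)  # bypass __init__ so that d is never set
    for name, value in json.loads(json_str).items():
        setattr(dobj, name, value)
    return dobj


obj = A(a=2, b=3)
dobj = deserialize(serialize(obj))
```

Here dobj has no d attribute at all, yet `A.target_func(dobj, x)` gives the same result as `A.target_func(obj, x)`, which is exactly the equivalence defined earlier.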

That wasn't too hard. But it could get convoluted. Either way, we really should use computers for such boring tasks!

That's something py2json would like to help you with.
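As a taste of how a computer could do this bookkeeping, the sketch below walks the syntax tree of the example code with the standard `ast` module and collects the attributes reached from target_func. It is an illustration only and not how py2json does it; `attrs_needed` is a hypothetical helper, and it handles only simple cases (direct attribute access on the object, and calls that pass the object positionally to module-level functions).

```python
import ast

SOURCE = '''
def func(obj):
    return obj.a + obj.b

class A:
    e = 2
    def __init__(self, a=0, b=0, c=1, d=10):
        self.a = a
        self.b = b
        self.c = c
        self.d = d

    def target_func(self, x=3):
        t = func(self)
        tt = self.other_method(t)
        return x * tt / self.e

    def other_method(self, x=1):
        return self.c * x
'''


def attrs_needed(source, class_name, method_name):
    """Collect data attributes of the object that a method (transitively) reads."""
    tree = ast.parse(source)
    module_funcs = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    cls = next(n for n in tree.body
               if isinstance(n, ast.ClassDef) and n.name == class_name)
    methods = {n.name: n for n in cls.body if isinstance(n, ast.FunctionDef)}
    needed, seen = set(), set()

    def visit(func_node, arg_index=0):
        if (func_node.name, arg_index) in seen:
            return  # already analyzed: avoids infinite recursion
        seen.add((func_node.name, arg_index))
        obj_name = func_node.args.args[arg_index].arg  # e.g. 'self' or 'obj'
        for node in ast.walk(func_node):
            if (isinstance(node, ast.Attribute)
                    and isinstance(node.value, ast.Name)
                    and node.value.id == obj_name):
                if node.attr in methods:
                    visit(methods[node.attr])  # follow method references
                else:
                    needed.add(node.attr)  # plain data attribute
            elif (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id in module_funcs):
                # The object is passed on to a module-level function
                for i, arg in enumerate(node.args):
                    if isinstance(arg, ast.Name) and arg.id == obj_name:
                        visit(module_funcs[node.func.id], i)

    visit(methods[method_name])
    return needed


needed = attrs_needed(SOURCE, 'A', 'target_func')
```

For the example class this recovers the same set found manually, {'a', 'b', 'c', 'e'}, and leaves out d. Dynamic access such as `getattr(self, name)` is invisible to this kind of static analysis.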
