Specs
Business Requirements Specification
- Know what has changed between two channel “states” (e.g. the diff between the staged and main Studio trees, diff(staged, main))
- Desirable changes to content channels propagate quickly through the content pipeline
- Unnecessary and undesirable changes are stopped and corrected at an early stage of the pipeline
- Large trees like Khan Academy can be processed (think 500MB of uncompressed JSON)
Software Requirements Specification
The requirements can be subdivided into essential functionality, optional features, and stretch goals.
The essential objectives are:
- Produce a “summary diff” that contains only the counts of added/removed/moved/modified nodes
- Given any two trees (ricecooker, studio, kolibri), produce a detailed diff that lists the added/removed/moved/modified nodes between the two trees
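
The two essential objectives above can be illustrated with a minimal sketch. It assumes each tree is a nested dict whose nodes carry a stable `node_id` and an optional `children` list; these field names, and the function names `flatten`, `detailed_diff`, and `summary_diff`, are hypothetical and not taken from any existing ricecooker, Studio, or Kolibri API.

```python
def flatten(tree, parent_id=None, index=None):
    """Map each node_id to its parent_id and its own fields (children excluded)."""
    if index is None:
        index = {}
    fields = {k: v for k, v in tree.items() if k != "children"}
    index[tree["node_id"]] = {"parent_id": parent_id, "fields": fields}
    for child in tree.get("children", []):
        flatten(child, tree["node_id"], index)
    return index


def detailed_diff(old_tree, new_tree):
    """List node ids that were added, removed, moved, or modified."""
    old, new = flatten(old_tree), flatten(new_tree)
    diff = {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "moved": [],
        "modified": [],
    }
    for node_id in sorted(old.keys() & new.keys()):
        # A node counts as moved when its parent changed (assumption of this sketch).
        if old[node_id]["parent_id"] != new[node_id]["parent_id"]:
            diff["moved"].append(node_id)
        # A node counts as modified when any of its own fields changed.
        if old[node_id]["fields"] != new[node_id]["fields"]:
            diff["modified"].append(node_id)
    return diff


def summary_diff(diff):
    """Reduce a detailed diff to counts only."""
    return {kind: len(items) for kind, items in diff.items()}


old_tree = {"node_id": "root", "title": "Channel", "children": [
    {"node_id": "t1", "title": "Topic 1", "children": [
        {"node_id": "v1", "title": "Video"},
    ]},
    {"node_id": "t2", "title": "Topic 2", "children": []},
]}
new_tree = {"node_id": "root", "title": "Channel", "children": [
    {"node_id": "t1", "title": "Topic 1", "children": []},
    {"node_id": "t2", "title": "Topic Two", "children": [
        {"node_id": "v1", "title": "Video"},
        {"node_id": "v2", "title": "New video"},
    ]},
]}

diff = detailed_diff(old_tree, new_tree)
print(diff)
print(summary_diff(diff))
```

Here the summary is derived from the detailed diff, so both outputs stay consistent. A real implementation may need a different identity key when node ids are not stable across the two trees.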
Optional (depending on frontend needs):
- Remove redundancy in the diff format (e.g. show only a node move instead of node add, node delete, and node move)
- Post-process the diff format so that it is convenient to present to users
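
One way the redundancy removal above might work is to pair each "removed" entry with an "added" entry that refers to the same content, and report the pair as a single move. The sketch below assumes diff entries are dicts with `content_id` and `parent_id` keys; that entry shape and the function name `collapse_moves` are hypothetical.

```python
def collapse_moves(raw_diff):
    """Replace matching (removed, added) pairs with a single 'moved' entry."""
    # Index removed entries by content_id so added entries can be matched to them.
    removed_by_content = {}
    for entry in raw_diff["removed"]:
        removed_by_content.setdefault(entry["content_id"], []).append(entry)

    added = []
    moved = list(raw_diff.get("moved", []))
    for entry in raw_diff["added"]:
        candidates = removed_by_content.get(entry["content_id"])
        if candidates:
            old = candidates.pop(0)
            moved.append({
                "content_id": entry["content_id"],
                "old_parent_id": old["parent_id"],
                "new_parent_id": entry["parent_id"],
            })
        else:
            added.append(entry)

    # Whatever was not matched is a genuine removal.
    removed = [e for entries in removed_by_content.values() for e in entries]
    return {**raw_diff, "added": added, "removed": removed, "moved": moved}


raw_diff = {
    "added": [
        {"content_id": "c1", "parent_id": "t2"},
        {"content_id": "c3", "parent_id": "t2"},
    ],
    "removed": [
        {"content_id": "c1", "parent_id": "t1"},
        {"content_id": "c2", "parent_id": "t1"},
    ],
    "moved": [],
    "modified": [],
}
clean_diff = collapse_moves(raw_diff)
print(clean_diff)
```

In this example only `c1` appears on both sides, so it becomes one move from `t1` to `t2`, while `c3` stays an addition and `c2` stays a removal.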
Stretch goals:
- Post-process the diff to reach a minimum description length and avoid information overload (e.g. instead of showing 30 content nodes added, display the diff as the single action of adding one topic node)
- Support multiple kinds of trees:
  - JSON: ricecooker, Studio wire format, Studio trees
  - Django ORM: Studio, Kolibri
  - Basic ORM: standalone script processing of Kolibri sqlite3 DB files
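
The minimum-description-length stretch goal could be approached by reporting only the topmost added nodes, each with a count of the added nodes beneath it. The following is a rough sketch under the assumption that the diff provides a list of added node ids and a child-to-parent mapping for the new tree; `collapse_added` and `parent_of` are hypothetical names.

```python
def collapse_added(added_ids, parent_of):
    """Keep only added nodes whose parent was not itself added."""
    added = set(added_ids)
    roots = [n for n in added_ids if parent_of.get(n) not in added]
    counts = {root: 0 for root in roots}
    for node_id in added_ids:
        # Walk up until the parent is no longer part of the added set.
        top = node_id
        while parent_of.get(top) in added:
            top = parent_of[top]
        if top != node_id:
            counts[top] += 1
    return [{"node_id": r, "descendants_added": counts[r]} for r in roots]


parent_of = {
    "topic": "root",
    "v1": "topic",
    "v2": "topic",
    "sub": "topic",
    "v3": "sub",
}
added_ids = ["topic", "v1", "v2", "sub", "v3"]
collapsed = collapse_added(added_ids, parent_of)
print(collapsed)
```

Five raw "added" entries are reduced to one entry for `topic` with four added descendants, which is the kind of compact description a user interface could show first and expand on demand.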