KST: Kaitai Struct Tests

Traditionally, testing Kaitai Struct relied on running hand-written specs. For every target language, each spec does the following:

- Given an input binary file and a library generated by Kaitai Struct Compiler for that target language, it runs the library's parsing procedure on the input file, producing a structure in memory with the parsed contents of the file.
- It then compares the actual parsing results with the expected ones: it takes certain key members of the in-memory structure and asserts that they are equal to what we expect, or fall within certain boundaries.
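
As an illustration, a hand-written spec for one target language might look roughly like the Python sketch below. The ExprArray class is a hypothetical stand-in for a library generated by Kaitai Struct Compiler (a real generated class would read the file through the Kaitai Struct runtime), and both the byte layout and the input bytes are made up for the example.

```python
import struct
import unittest


class ExprArray:
    """Hypothetical stand-in for a generated parser class (assumption)."""

    def __init__(self, data: bytes):
        # Assumed layout for this sketch: four little-endian u4 values.
        self.aint = list(struct.unpack("<4I", data[:16]))

    @property
    def aint_size(self) -> int:
        return len(self.aint)


class TestExprArray(unittest.TestCase):
    def test_expr_array(self):
        # Made-up input; a real spec would read a .bin file from disk.
        data = struct.pack("<4I", 10, 20, 30, 40)
        r = ExprArray(data)                       # parse the input
        self.assertEqual(r.aint_size, 4)          # compare a key member
        self.assertTrue(0 <= r.aint[0] <= 255)    # "within certain boundaries"
```

Saved to a file, the sketch can be run with python -m unittest. Every supported language needs its own equivalent of this file, which is where the maintenance cost comes from.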

While the procedure itself is solid, supporting it quickly becomes very labor-intensive as the number of languages that KS supports grows. Adding a new test now requires adding ~10-12 per-language spec files, and doing that manually is both inefficient and error-prone.

The solution comes from the Kaitai Struct compiler itself: we already have an expression language that can be used to address structures in memory, so why not use it to generate test specs as well?

KST format

Cross-language test specs are described in the KST (Kaitai Struct Test) format, which, like KSY, is YAML-based.

id: foo_bar
data: expr_array.bin
asserts:
  - actual: aint_size
    expected: 4

Here:

- id is the name of the test spec, which MUST exactly match the name of a KSY file in the formats dir.
- data is the name of the input binary data file, which MUST be present in the src dir; by convention, all these files use the ".bin" extension.
- asserts is a sequence of assertions used to check that the contents of the parsed structure in memory match our expectations.

Every assertion has two parts:
- actual is a KS expression, evaluated in the context of the root element of the KSY file under test, that extracts a certain member of the in-memory structure to check.
- expected is normally a constant that we expect actual to match (but it is also a KS expression, so technically it can be non-constant).

Assertions are expected to be fatal (i.e. the first failed assertion stops the test process) and are executed in the order specified in asserts.
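
The fatal, in-order behavior can be sketched in Python as follows. This only illustrates the semantics described above and is not the translator's actual output: run_asserts is a hypothetical helper, and it simplifies each actual expression to a plain attribute name, whereas real KS expressions are translated into each target language.

```python
from types import SimpleNamespace


def run_asserts(root, asserts):
    """Check assertions in order; the first mismatch aborts the run."""
    for i, a in enumerate(asserts):
        # Simplification: treat 'actual' as a plain attribute name on root.
        actual = getattr(root, a["actual"])
        if actual != a["expected"]:
            raise AssertionError(
                f"assert #{i} ({a['actual']}): "
                f"expected {a['expected']!r}, got {actual!r}"
            )
    return len(asserts)


# Stand-in for a parsed structure (made-up values).
root = SimpleNamespace(aint_size=4, first=1)
run_asserts(root, [{"actual": "aint_size", "expected": 4}])
```

Because the loop raises on the first mismatch, later assertions are never evaluated, so they can safely depend on earlier ones having passed.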

This file must be saved as foo_bar.kst (i.e. the name MUST match the id and the name of the relevant .ksy file) in the spec/ks dir.

When a test is expected to fail with an exception, a slightly different KST file is used:

id: valid_fail_contents
data: fixed_struct.bin
exception: ValidationNotEqualError<bytes>

No asserts are present in this case, but there is a top-level exception element, which lists the expected exception to be raised during the parsing stage of this test. Available exception names are listed in KSError.scala in kaitai-struct-compiler.
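
In a target language, such a test might boil down to asserting that parsing raises the named exception. A rough Python sketch, in which ValidationNotEqualError and parse_fixed_struct are hypothetical stand-ins defined locally (they are not the runtime's real classes, and the magic value is made up):

```python
import unittest


class ValidationNotEqualError(Exception):
    """Local stand-in for the runtime's validation exception (assumption)."""


def parse_fixed_struct(data: bytes) -> bytes:
    """Hypothetical parser: the first 4 bytes must equal a fixed magic."""
    magic = data[:4]
    if magic != b"PACK":
        raise ValidationNotEqualError(f"expected b'PACK', got {magic!r}")
    return magic


class TestValidFailContents(unittest.TestCase):
    def test_valid_fail_contents(self):
        # The test passes only if parsing raises the expected exception.
        with self.assertRaises(ValidationNotEqualError):
            parse_fixed_struct(b"XXXX-rest-of-file")
```

If parsing succeeds, or raises a different exception, the test fails.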

Invoking KST translator

To make use of KST specs, they need to be translated into actual target languages (just like KSY gets compiled into target languages), so there is a translator project to do that.

KST translator is heavily based on kaitai-struct-compiler, so one needs to compile and publish the kaitai-struct-compiler .jar files to the local repository first:

# assuming we're in root project directory
cd compiler
sbt publishLocal

After that, one can invoke it using the spec_kst_to_all script in tests:

KST translator 0.10
Usage: kst_translator [options] [<test_name>...]

  <test_name>...           source test names (.kst)
  -t, --target <language>  target languages (construct, cpp_stl_98, cpp_stl_11, csharp, go, java, javascript, nim, perl, php, python, ruby, rust, default: all)
  --all-specs              process all KST files available
  -f, --force              force overwrite specs in production spec dirs (default: generate in ../spec/ks/out)

For example, to generate specs in all possible target languages directly in all relevant dirs for one KST spec "foo_bar", one can run:

cd ../tests
./spec_kst_to_all -f foo_bar

Developing KST translator

TODO