PyCon Lithuania 2024

Analyzing STDF production test data in the silicon manufacturing industry using construct
2024-04-04 , Room 228

Neither the amount of data nor the complexity of the queries is particularly large in this industry. The challenge comes from the STDF format, a binary file format with roots in the 1980s.

A method to make this data source available to modern data analysis tools (jupyter/streamlit) using the construct library will be discussed. The focus is on how the data can be collected, converted and made available in a fast and efficient way, using both pypy and cpython.


The silicon industry data environment

In the silicon production industry, production test results are still widely stored in STDF, a file format that is on its way out but remains in common use.

This ageing file format is well established and protected by strong institutional momentum, even though contenders such as TEMS (which specifies messages rather than a file format) and RITdb aim to replace it.

This presentation shows how to leverage construct (https://github.com/construct/construct) together with pypy and cpython to transform STDF data to parquet, making it accessible to modern and efficient analysis methods (polars/pandas dataframes).

The solutions enabled by construct

Using construct, the STDF file format specification can be turned, largely by copy/paste and search/replace, into an implementation that can both parse and generate STDF files.

This is what a segment of the implementation looks like:

import construct
PGR_payload = construct.Struct("GRP_INDX" / construct.Int16ul                               * "Unique index associated with pin group",
                               "GRP_NAM"  / construct.PascalString(construct.Byte, "ascii") * "Name of pin group length byte = 0",
                               )  # remaining PGR fields omitted in this excerpt

This implementation allows us to easily create:
- publication-grade analyses using Jupyter Book
- real-time dashboards using streamlit
- transformers which convert one STDF file into another (for example for NDA purposes, as in this presentation)
- test results stored directly in the STDF file format (vendor-independent test implementation and execution)

With the pypy just-in-time compiler, parsing is fast enough that the bottleneck in our environment is network throughput.

Conclusion

Old binary file formats that are specified by tables can easily be made accessible to modern methods using construct.

Career:
- 2013 – present application engineering and validation/verification engineering at ams-osram
- 2011 – 2013 home automation at intrate(c)
- 2010 – 2011 RMA engineer at infineon
- 2009 graduation from the university of applied sciences FH JOANEUM
- 2007 – 2010 mixed signal ASIC development at FH JOANEUM
- 2008 participation in the CERN Summer Student programme
- 2001 – 2005 industrial automation at intrate(c)