How to do a Protocol level Diff/Compare

Ok, so I ran my regression tests after making a few minor RTL changes and guess what?! Some tests failed! They were passing and now they fail! Aargh! Wish I had a quick way to compare the passing and failing runs…

We have all experienced this situation where we want to debug by comparison. Performing a comparison can be a quick way to identify what changed and thus root cause the failure.

While source code files can be easily compared using diff tools, comparing simulation output (logs and waveforms) is not straightforward. Every time you run your simulation, the time stamps for various events can change. Also, the interleaving between parallel threads in hardware might change, as shown in the following figures.

[Figures: thread sequences for Run1, Run2 and Run3]

Run1 and Run2 have a different overall execution sequence, but they are essentially the same when you look at them at a per-thread level. In Run3, on the other hand, thread 1 executes differently, and this is the kind of difference that we are interested in.

These two factors mean that we cannot use text file comparison tools for comparing simulation output. What we need is a comparison tool that is protocol aware, so that it can isolate the various parallel threads of activity. The tool should also be able to do untimed comparison by ignoring the time at which a particular activity occurred.

This is precisely what our PDA tool does. The tool understands protocols (such as USB, PCIe, DDR and AMBA, as well as custom/proprietary protocols), so it can isolate parallel threads of activity and do thread-wise untimed comparisons. This way, the tool can identify the relevant activity difference between two simulations and help with quick identification of issues.
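To make the idea of a thread-wise untimed comparison concrete, here is a minimal Python sketch. It is not how the PDA tool is implemented; the event format `(time, thread_id, activity)`, the function names and the sample runs are all assumptions made up for illustration. It simply groups events per thread, drops the time stamps and compares the resulting sequences.

```python
from collections import defaultdict

def group_by_thread(events):
    """Group (time, thread_id, activity) events per thread, dropping the time.

    Assumes the events are listed in the order they occurred.
    """
    threads = defaultdict(list)
    for _time, thread_id, activity in events:
        threads[thread_id].append(activity)
    return dict(threads)

def untimed_diff(reference, current):
    """Return {thread_id: (ref_sequence, cur_sequence)} for threads that differ."""
    ref_threads = group_by_thread(reference)
    cur_threads = group_by_thread(current)
    diffs = {}
    for thread_id in sorted(set(ref_threads) | set(cur_threads)):
        ref_seq = ref_threads.get(thread_id, [])
        cur_seq = cur_threads.get(thread_id, [])
        if ref_seq != cur_seq:
            diffs[thread_id] = (ref_seq, cur_seq)
    return diffs

# Hypothetical runs: run2 only changes timing and interleaving,
# run3 changes the order of activity inside thread t1.
run1 = [(10, "t1", "WRITE A"), (20, "t2", "READ B"),
        (30, "t1", "READ A"), (40, "t2", "WRITE B")]
run2 = [(12, "t2", "READ B"), (18, "t1", "WRITE A"),
        (35, "t2", "WRITE B"), (50, "t1", "READ A")]
run3 = [(10, "t1", "READ A"), (20, "t2", "READ B"),
        (30, "t1", "WRITE A"), (40, "t2", "WRITE B")]

print(untimed_diff(run1, run2))  # no per-thread difference
print(untimed_diff(run1, run3))  # only thread t1 is reported
```

A plain text diff would flag run1 against run2 on every line, while this comparison reports nothing for that pair and only thread `t1` for run1 against run3.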

My test failed with some vague scoreboard error related to data integrity. To identify the issue, I launched the PDA tool and performed a comparison between my current failing run and a previously passing reference run of the test. I directed the tool to compare the activity on the AXI interface and this is what I got:

[Figure: comparison result without time]

This very clearly tells me that for one particular AXI master, the order of reads and writes differs between the two runs. As a result, an incorrect memory address gets programmed into the DMA, which causes the data integrity problem.

Thus the PDA tool helped me in performing a quick comparative debug. And the best part is that I can fully automate this comparison. I can automatically compare every regression failure with the last passing state for that test.
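As a rough sketch of that automation step, the Python below picks, for each currently failing test, the most recent run in which it passed. The history format `(run_id, test_name, passed)` and the function name are hypothetical; this says nothing about the PDA tool's own interface, only how the failing/reference pairs could be chosen before handing them to a comparison.

```python
def last_passing_reference(history):
    """Pair each currently failing test with its most recent passing run.

    history: list of (run_id, test_name, passed) in chronological order.
    Returns {test_name: (failing_run_id, reference_run_id)}.
    """
    last_pass = {}  # test -> run_id of the most recent pass
    latest = {}     # test -> (run_id, passed) of the most recent run
    for run_id, test, passed in history:
        if passed:
            last_pass[test] = run_id
        latest[test] = (run_id, passed)

    pairs = {}
    for test, (run_id, passed) in latest.items():
        # Only tests that fail now and have passed at some point get a pair.
        if not passed and test in last_pass:
            pairs[test] = (run_id, last_pass[test])
    return pairs

# Hypothetical regression history.
history = [
    ("r1", "dma_test", True), ("r1", "irq_test", True),
    ("r2", "dma_test", True), ("r2", "irq_test", False),
    ("r3", "dma_test", False), ("r3", "irq_test", False),
]

for test, (failing, reference) in last_passing_reference(history).items():
    print(f"{test}: compare {failing} against {reference}")
```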

Author: Aditya Mittal
