Blog post - How far can LLMs automate triage decisions of the UK's merger control authority?

Playing with LLMs to automate the triage work of the CMA

Tony Curzon Price, 8/9/24

Credits to DALL-E for the image, generated with the prompt:

    Create an image in the style of 1960s sci-fi comics. The scene should depict a robot sitting at a desk piled high with files and documents. The robot looks overwhelmed as it decides whether to throw a file into the trash bin or pass it up to a human superior. The background should have a retro-futuristic office setting, with bright colors and bold lines typical of comic book art from the 1960s. The robot should have an appearance reflective of the era's vision of future technology, with elements like exposed gears, antennae, and metallic limbs.

When a merger comes to the attention of the UK's competition authority, the CMA, the authority needs to decide whether to investigate the transaction. There are two levels of triage: first, whether to investigate at all, and second, having performed a very rapid assessment (known as Phase 1), whether to proceed to an in-depth investigation (known as Phase 2). This note describes progress on automating the early stages of that triage.

I am working with the example of the recently announced decision on whether to start a Phase 1 investigation of the proposed merger of Synopsys and ANSYS, two software vendors in the highly technical domains of electronic design automation and engineering simulation.

I started by automating the listing of the major products of the two companies with the following prompt:
    
        prompt_string = f"Please return a list of the products that {company} sells. Please return this in json format.\n\nFor example, if the company is 'ANSYS', the first 2 lines of the response would be as follows:\n{output_example}"

        system_msg = "You are an artificial intelligence assistant and you need to engage in a helpful, detailed, polite conversation with a user."

    
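The snippet above omits the model call and the parsing of the reply. Here is a minimal, self-contained sketch of that round trip. The `{"products": [...]}` reply shape and the helper names are my assumptions for illustration, not the post's actual code:

```python
import json

def product_messages(company: str) -> list[dict]:
    """Assemble the chat messages for the product-listing query."""
    # Assumed reply shape -- the post does not show what output_example contained.
    output_example = '{"products": ["ANSYS Mechanical", "ANSYS Fluent"]}'
    prompt_string = (
        f"Please return a list of the products that {company} sells. "
        f"Please return this in json format.\n\n"
        f"For example, if the company is 'ANSYS', the first 2 lines of the "
        f"response would be as follows:\n{output_example}"
    )
    system_msg = ("You are an artificial intelligence assistant and you need to "
                  "engage in a helpful, detailed, polite conversation with a user.")
    return [{"role": "system", "content": system_msg},
            {"role": "user", "content": prompt_string}]

def parse_products(reply: str) -> list[str]:
    """Parse the model's JSON reply into a plain list of product names."""
    return json.loads(reply)["products"]
```

In the live flow, `reply` would come from a chat-completion call, e.g. an OpenAI-style `client.chat.completions.create(model=..., messages=product_messages("ANSYS"))`; the post does not say which client it uses.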
I next created a Synopsys x ANSYS matrix of products and asked the LLM to score the degree to which the user of one product might instead use any of the other products. My prompt was this:
    
        prompt_string = f"""What is the likelihood that you'd be happy to substitute {comp2}'s product '{col_name}' for the use you make of {comp1}'s '{comp1_prod}'? Please return your answer as a decimal between 0 and 1 in json format.\n\nFor example, if you were asked whether you'd be happy to substitute ANSYS's ANSYS Mechanical for Synopsys's Fusion Compiler, your answer would be:\n{output_example}\n\nAnother example would be if you were asked whether you'd be happy to substitute Cadence's Genus Synthesis Solution for Synopsys's Fusion Compiler; your answer would be {{"substitute": 0.7}}\n\nAnother example would be if you were asked whether you'd be happy to substitute Microsoft's Office for Synopsys's Fusion Compiler; your answer would be {{"substitute": 0.0}}"""

        system_msg = f"Imagine you are an expert user of {comp1}'s product '{comp1_prod}' and that you are helpfully answering a market research questionnaire."
    
This produced the following table:
Synopsys product ANSYS Mechanical ANSYS Fluent ANSYS HFSS ANSYS CFX ANSYS Chemkin-Pro ANSYS Maxwell ANSYS OptiSLang ANSYS Discovery ANSYS LS-DYNA ANSYS Medini Analyze ANSYS SCADE ANSYS Twin Builder
0 Fusion Compiler 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 Design Compiler 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2 IC Compiler II 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3 VCS 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 Custom Compiler 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1
5 HSPICE 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.2
6 PrimeTime 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
7 Simpleware 0.2 0.0 0.0 0.1 0.0 0.0 0.1 0.3 0.2 0.1 0.0 0.1
8 Sentaurus 0.0 0.0 0.0 0.0 0.1 0.2 0.1 0.1 0.0 0.0 0.0 0.1
9 Synplify 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
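Assembling such a matrix is just a nested loop over the two product lists. A minimal sketch, with the LLM call injected as a `score_fn` so the assembly logic runs offline; the function shape is my assumption, not the post's actual code:

```python
import pandas as pd

def substitution_matrix(comp1, comp1_products, comp2, comp2_products, score_fn):
    """Build the comp1 x comp2 substitutability matrix.

    score_fn(comp1, prod1, comp2, prod2) -> float in [0, 1]; in the live
    flow it would send the substitution prompt to the LLM and parse the
    {"substitute": x} reply.
    """
    rows = [
        {p2: score_fn(comp1, p1, comp2, p2) for p2 in comp2_products}
        for p1 in comp1_products
    ]
    return pd.DataFrame(rows, index=comp1_products)
```

Injecting the scoring function also makes it cheap to swap in a cached or batched scorer later without touching the matrix-building code.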

In order to calibrate this table, I asked the LLM who was the closest competitor to Synopsys - Cadence, it turns out - and performed the same exercise with Cadence's products, yielding the following matrix:

Synopsys product Cadence Virtuoso Cadence Innovus Cadence Allegro Cadence Spectre Cadence PSpice Cadence Genus Cadence Modus Cadence JasperGold Cadence Xcelium Cadence Palladium
0 Design Compiler 0.0 0.3 0.0 0.0 0.0 0.4 0.2 0.0 0.0 0.1
1 IC Compiler 0.1 0.6 0.1 0.0 0.0 0.3 0.0 0.0 0.0 0.0
2 PrimeTime 0.0 0.1 0.0 0.0 0.0 0.1 0.0 0.1 0.0 0.0
3 VCS 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.6 0.2
4 HSPICE 0.2 0.1 0.1 0.5 0.2 0.1 0.0 0.0 0.0 0.1
5 Custom Compiler 0.3 0.1 0.1 0.2 0.1 0.2 0.1 0.0 0.0 0.0
6 Sentaurus TCAD 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
7 SpyGlass 0.1 0.0 0.0 0.1 0.0 0.2 0.1 0.3 0.1 0.1
8 Sabre 0.3 0.0 0.2 0.3 0.4 0.1 0.2 0.0 0.1 0.1
9 ZeBu 0.0 0.0 0.1 0.0 0.0 0.0 0.1 0.1 0.1 0.4
10 Verdi 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.0
11 Coverity 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0

The result is strikingly different: it is immediately clear that there is a great deal of overlap between the Synopsys and Cadence offerings and very little between those of Synopsys and ANSYS. This suggests, prima facie, that there is little scope for a horizontal theory of harm in the proposed merger.
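One way to turn that visual impression into a number is to average, over Synopsys's products, the best substitute score available at the other firm. A quick sketch using the row maxima read off the two tables above; the mean-of-maxima summary statistic is my choice, not part of the original flow:

```python
def overlap_score(row_maxima):
    """Mean of per-product maxima: on average, how good is the other
    firm's best substitute for each Synopsys product?"""
    return sum(row_maxima) / len(row_maxima)

# Per-row maxima read off the two tables above.
vs_ansys   = [0.0, 0.0, 0.0, 0.0, 0.1, 0.2, 0.0, 0.3, 0.2, 0.0]
vs_cadence = [0.4, 0.6, 0.1, 0.6, 0.5, 0.3, 0.0, 0.3, 0.4, 0.4, 0.2, 0.1]

print(round(overlap_score(vs_ansys), 3))    # 0.08
print(round(overlap_score(vs_cadence), 3))  # 0.325
```

On this summary, the Synopsys/Cadence overlap comes out roughly four times larger than the Synopsys/ANSYS overlap, which matches the eyeball reading of the two matrices.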

It should be noted that every step of the above process can be automated in a relatively general way: give the program any two company names and it can perform this task.

Next, to automating an indicator of vertical theories of harm.

For this, I take a close competitor of one of the merging parties and, for each of the competitor's products, ask whether the products of the other merging party constitute an important input. The prompt I currently use is:

    
        prompt_string = f"""You are involved in a project that requires you to be using {competitor}'s '{competitor_prod}' product. Use that information to infer what the end goal of the project is likely to be. How likely is it that, in order to achieve that end goal, you will absolutely need to use the {col_name} product from {comp2}?\nPlease return your answer as a decimal between 0 and 1 in json format.\n\nFor example, if you were using Cadence's Virtuoso, it is likely that you are working in a team doing a chip design and that you or someone in the team would also need to use ANSYS's Ansys RedHawk. Therefore, for that pair of products, you would return:\n{output_example}\n\nAnother example would be if you were using Cadence's Virtuoso, would you or someone also need to use a Tesla Model X? Since driving a Tesla is not essential to any task in teams that are designing chips, your answer would be: {{"vertical": 0.0}}"""

        system_msg = f"Imagine you are a user of {competitor}'s product '{competitor_prod}' and that you are helpfully answering a market research questionnaire."
    

This prompt is meant to capture the thinking required to determine whether a merging party would have the ability to foreclose or raise rivals' costs by controlling the products of the other merging party, to the detriment of competitors.
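One practical wrinkle with this flow: chat models do not always return bare JSON, so the {"vertical": x} reply needs defensive parsing. A tolerant parser, sketched on the assumption that replies may arrive wrapped in prose or code fences:

```python
import json
import re

def parse_vertical_score(reply: str) -> float:
    """Extract the {"vertical": x} score from a model reply, tolerating
    the surrounding prose or code fences that chat models often add."""
    match = re.search(r'\{[^{}]*"vertical"[^{}]*\}', reply)
    if match is None:
        raise ValueError(f"no vertical score found in: {reply!r}")
    score = float(json.loads(match.group(0))["vertical"])
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    return score
```

The range check also catches the occasional model reply that returns a percentage instead of a decimal.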

The result of this prompt when I use Cadence and Ansys as the competitor and the relevant merging party (i.e. the party with the weaker horizontal links to the competitor) is the following matrix, with Cadence products in rows:

Cadence product ANSYS Mechanical ANSYS Fluent ANSYS Electronics Desktop ANSYS HFSS ANSYS Maxwell ANSYS Discovery ANSYS Chemkin-Pro ANSYS Autodyn ANSYS Lumerical ANSYS Cloud ANSYS Twin Builder ANSYS Speos
0 Cadence Allegro PCB Designer 0.3 0.3 0.6 0.5 0.4 0.6 0.05 0.2 0.2 0.3 0.4 0.2
1 Cadence Virtuoso 0.2 0.2 0.5 0.3 0.2 0.2 0.10 0.2 0.2 0.3 0.3 0.1
2 Cadence Spectre 0.2 0.1 0.5 0.3 0.3 0.3 0.10 0.1 0.2 0.3 0.4 0.3
3 Cadence Innovus 0.2 0.1 0.7 0.7 0.1 0.3 0.10 0.1 0.2 0.4 0.3 0.1
4 Cadence OrCAD 0.2 0.2 0.2 0.2 0.3 0.3 0.10 0.3 0.2 0.3 0.2 0.2
5 Cadence Sigrity 0.3 0.3 0.4 0.7 0.5 0.4 0.10 0.2 0.2 0.2 0.2 0.1
6 Cadence Xcelium 0.1 0.1 0.3 0.3 0.2 0.2 0.10 0.1 0.2 0.1 0.3 0.1

As a cross-check, I ran the same algorithm for Cadence and Tesla, where one would expect zeros throughout. The result was this:

Cadence product Tesla Model S Tesla Model 3 Tesla Model X Tesla Model Y Tesla Cybertruck Tesla Roadster Tesla Semi Tesla Solar Panels Tesla Solar Roof Tesla Powerwall Tesla Powerpack Tesla Megapack
0 Virtuoso 0.00 0.0 0.0 0.0 0.00 0.0 0.0 0.00 0.0 0.0 0.1 0.1
1 Spectre 0.00 0.0 0.0 0.0 0.00 0.1 0.1 0.10 0.1 0.1 0.1 0.0
2 Xcelium 0.00 0.0 0.0 0.0 0.01 0.0 0.0 0.00 0.0 0.2 0.1 0.1
3 Innovus 0.00 0.1 0.0 0.0 0.00 0.1 0.1 0.00 0.0 0.1 0.1 0.1
4 JasperGold 0.00 0.1 0.1 0.1 0.00 0.0 0.0 0.00 0.0 0.0 0.1 0.0
5 Palladium 0.05 0.0 0.0 0.1 0.10 0.0 0.0 0.00 0.0 0.1 0.1 0.1
6 Protium 0.00 0.0 0.0 0.0 0.00 0.0 0.1 0.00 0.0 0.0 0.1 0.1
7 Genus 0.00 0.0 0.0 0.1 0.00 0.0 0.0 0.00 0.0 0.0 0.2 0.0
8 Tempus 0.00 0.0 0.1 0.0 0.00 0.0 0.1 0.10 0.1 0.1 0.1 0.1
9 Innovium 0.00 0.0 0.0 0.0 0.00 0.0 0.1 0.01 0.1 0.1 0.2 0.2

Although there is clearly some noise in the Cadence/Tesla table, the differences between the two tables are marked and are very much going in the right direction.

In other words, what the work to date has shown is that, in the Ansys/Synopsys case, an entirely automated flow suggests there is very little horizontal overlap between the parties' products, but that ANSYS's products are significant inputs for users of Synopsys's closest competitor, so vertical theories of harm merit attention.

Is this enough to automate a decision to proceed with a Phase 1 investigation? I would say that this, together perhaps with filters for size and strategic importance, is good enough. It certainly suggests that LLMs have a role to play in increasing the productivity of authorities and advisors.
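To make that concrete, here is a toy decision rule combining the two indicators with a size filter. The 0.2 and 0.3 thresholds, and the hard-threshold structure itself, are illustrative assumptions rather than CMA practice:

```python
def triage(horizontal_overlap: float, vertical_dependence: float,
           passes_size_filter: bool = True) -> str:
    """Toy triage rule: flag for Phase 1 if either summary indicator
    clears its (illustrative) threshold and the size filter passes."""
    if not passes_size_filter:
        return "no further action"
    if horizontal_overlap >= 0.2 or vertical_dependence >= 0.3:
        return "proceed to Phase 1 assessment"
    return "no further action"
```

With illustrative summary scores of 0.08 (horizontal) and 0.34 (vertical), such a rule would send Synopsys/ANSYS to Phase 1 on the vertical indicator alone.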

When I next have time to play with this, I will make a version that I can put online, so that anyone can input any two companies and generate an automated opinion on whether the merger should go to a more involved - and human - triage.