Note

Click here to download the full example code

# Dense Subgraphs¶

*Technical details are available in the API documentation:* sf.apps.subgraph

Graphs can be used to model a wide variety of concepts: social networks, financial markets, biological networks, and many others. A common problem of interest is to find subgraphs that contain a large number of connections between their nodes. These subgraphs may correspond to communities in social networks, correlated assets in a market, or mutually influential proteins in a biological network.

Mathematically, this task is known as the dense subgraph problem. The density of a \(k\)-node subgraph is equal to the number of its edges divided by the maximum possible number of edges. Identifying the densest graph of a given size, known as the densest-\(k\) subgraph problem, is NP-Hard.

As shown in [44], a defining feature of GBS is that when we encode a graph into a GBS device, it samples dense subgraphs with high probability. This property can be used to find dense subgraphs by sampling from a GBS device and postprocessing the outputs. Let’s take a look!

## Finding dense subgraphs¶

The first step is to import all required modules. We’ll need the `data`

module to load pre-generated samples, the `sample`

module to postselect samples, the
`subgraph`

module to search for dense subgraphs, and the `plot`

module to
visualize the graphs. We’ll also use Plotly which is required for the `plot`

module and
NetworkX for graph operations.

```
from strawberryfields.apps import data, sample, subgraph, plot
import plotly
import networkx as nx
```

Here we’ll study a 30-node graph with a planted 10-node graph, as considered in [44]. The graph is generated by joining two Erdős–Rényi random graphs. The first graph of 20 nodes is created with edge probability of 0.5. The second planted graph is generated with edge probability of 0.875. The planted nodes are the last ten nodes in the graph. The two graphs are joined by selecting 8 nodes at random from both graphs and adding an edge between them. This graph has the sneaky property that even though the planted subgraph is the densest of its size, its nodes have a lower average degree than the nodes in the rest of the graph.

The `data`

module has pre-generated GBS samples from this graph. Let’s load them,
postselect on samples with a large number of clicks, and convert them to subgraphs:

```
planted = data.Planted()
postselected = sample.postselect(planted, 16, 30)
pl_graph = nx.to_networkx_graph(planted.adj)
samples = sample.to_subgraphs(postselected, pl_graph)
print(len(samples))
```

Out:

```
2181
```

Not bad! We have more than 2000 samples to play with 😎. The planted subgraph is actually easy to identify; it even appears clearly from the force-directed Kamada-Kawai algorithm that is used to plot graphs in Strawberry Fields:

```
sub = list(range(20, 30))
plot_graph = plot.graph(pl_graph, sub)
plotly.offline.plot(plot_graph, filename="planted.html")
```