Molecular Networking

Molecular networking groups metabolites by the similarity between their MS2 spectra. Because an MS2 spectrum records the fragments a molecule breaks into, and those fragments reflect its chemical substructures, molecular networking is essentially a way to cluster compounds by their structural similarity.

Networks are composed of two major parts: nodes and edges.

Each node represents a metabolite we measured in an experiment. Because a single experiment can generate hundreds of redundant MS2 spectra corresponding to the same molecule, we cluster these spectra into a single node with a representative MS2 spectrum.

Edges are lines that connect nodes. Applied across the whole experiment, these links give the network its structure. However, if we drew an edge between every pair of nodes, the result would be a gigantic hairball. So, we set a cosine similarity score cutoff that determines how similar two spectra must be before an edge is drawn between their nodes.
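To make the cutoff idea concrete, here is a minimal sketch of how edges could be drawn from pairwise cosine scores. It is an illustration only, not our production scoring: it uses a plain cosine on binned peaks (real molecular networking tools typically use a modified cosine that also accounts for precursor mass shifts), and the function names, the 0.01 bin width, and the example peak lists are all assumptions made up for this example.

```python
import math
from itertools import combinations


def bin_spectrum(peaks, bin_width=0.01):
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=0.01):
    """Plain cosine similarity between two binned spectra (0 = nothing shared)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_edges(spectra, cutoff=0.7):
    """Keep only the node pairs whose similarity exceeds the cutoff."""
    edges = []
    for (id_a, peaks_a), (id_b, peaks_b) in combinations(spectra.items(), 2):
        score = cosine_similarity(peaks_a, peaks_b)
        if score > cutoff:
            edges.append((id_a, id_b, score))
    return edges


# Hypothetical representative spectra: lists of (m/z, intensity) pairs.
spectra = {
    "node_1": [(85.03, 40.0), (127.04, 100.0), (145.05, 60.0)],
    "node_2": [(85.03, 35.0), (127.04, 100.0), (163.06, 20.0)],
    "node_3": [(91.05, 100.0), (119.08, 50.0)],
}
edges = build_edges(spectra)
for id_a, id_b, score in edges:
    print(f"{id_a} -- {id_b} (cosine = {score:.2f})")
```

In this toy data, node_1 and node_2 share two fragment peaks and end up connected, while node_3 shares no peaks with either and gets no edge.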

Image of two network nodes connected by an edge with a cosine similarity > 0.7

Previously, we described how spectral library searching often leaves a majority of your metabolites unannotated. However, while these unknown molecules may not exactly match any of the molecules in your spectral library, there’s a good chance that they’ll be structurally similar to some of them. A molecular network will highlight which molecules in your library are the most similar to your unknown metabolite of interest. 

A molecular network is a great framework to start to dig into data from hundreds of samples and learn the chemistry behind them. Since most molecular networks will consist of many smaller sub-networks, we can see exactly which molecules have similar MS2 spectra. These sub-networks are called molecular families and often represent compound classes or slightly modified molecules.
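In graph terms, a molecular family is a connected component: a set of nodes you can reach from one another by following edges. The sketch below shows one way to recover those sub-networks from an edge list. The node names and the `molecular_families` helper are hypothetical and only meant to illustrate the idea.

```python
from collections import defaultdict


def molecular_families(edges):
    """Group connected nodes into sub-networks (connected components)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    families = []
    for start in neighbors:
        if start in seen:
            continue
        # Depth-first walk collecting every node reachable from `start`.
        family, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in family:
                continue
            family.add(node)
            stack.extend(neighbors[node] - family)
        seen |= family
        families.append(family)
    return families


# Hypothetical edges that already passed the similarity cutoff.
example_edges = [("A", "B"), ("B", "C"), ("D", "E")]
families = molecular_families(example_edges)
for family in families:
    print(sorted(family))
```

Here A, B, and C form one family even though A and C are not directly connected, and D and E form a second one.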

Image showing molecular network, sub-network, and highlighting nodes, edges, and library matches

When paired with spectral libraries, a molecular network lets you discover which library compounds are most similar to your unknown compound of interest. For example, the molecular family below contains several nodes that match bile acids. Thus, it’s likely that the unknown metabolites are also bile acids.

Image showing a sub-network highlighting library matches for bile acids and an unknown related compound

Why use Ometa Labs to do molecular networking?

At Ometa Labs, we specialize in creating tools that allow you to streamline your metabolomics workflows so that you can spend less time looking at things you’ve already seen and more time discovering the hidden potential in your data. Our tools come with built-in visualization techniques to layer metadata directly onto your networks without extra software - and we’d love to show you what we can do.

 

Molecular Networking FAQs

What happens if none of the nodes in my sub-network are annotated? 

Unfortunately, this does happen, and it can make annotation directly from the network difficult. In this case, you can still manually inspect the spectra and layer metadata on these nodes. If any of these compounds merit further investigation, you can try out our spectral search tools to get more information.

Does distance between the nodes matter?

No, the distance between the nodes (the length of the edge) does not matter. However, edges between nodes will have different cosine similarity scores. You can check out these cosine scores and manually inspect the spectra from two connected nodes by simply clicking on an edge in our molecular networking dashboard. 

I have two nodes that are connected to the same node, but not connected to each other. Will they also be similar? 

If there is no edge between two nodes, that means their cosine similarity is below the cutoff value. If that cutoff is 0.7, their similarity could be as high as 0.69, but it could also be much lower. We do provide the option to redraw edges within a sub-network based on a new cutoff value without having to re-run the entire network. That said, compounds within the same molecular family will generally share some structural similarity (e.g., they may all be sugars), but there can still be significant variance within the same sub-network.
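A small sketch may help show why two nodes can share a neighbor without being connected, and what redrawing at a new cutoff changes. It assumes the pairwise scores inside a sub-network are already stored, so only the filtering step is repeated. The scores, node names, and the `edges_above` helper are invented for illustration and do not describe how our dashboard is implemented.

```python
def edges_above(pair_scores, cutoff):
    """Return the node pairs whose stored similarity exceeds the cutoff."""
    return sorted(pair for pair, score in pair_scores.items() if score > cutoff)


# Hypothetical stored cosine scores for one sub-network.
pair_scores = {
    ("hub", "left"): 0.82,
    ("hub", "right"): 0.78,
    ("left", "right"): 0.55,
}

strict = edges_above(pair_scores, 0.7)   # "left" and "right" are not linked
relaxed = edges_above(pair_scores, 0.5)  # lowering the cutoff adds that edge
print(strict)
print(relaxed)
```

At a cutoff of 0.7, "left" and "right" each connect to "hub" but not to each other. Lowering the cutoff to 0.5 draws the third edge without recomputing any score.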

I can’t see my standard in the network! Why isn’t it there?

There could be a couple of issues at play here:

  1. By default, the network only draws nodes that are connected to at least one other node. If you’ve added an internal standard that isn’t structurally similar to any of the compounds in your sample, then it may be what we call a “singleton” (a node that exists in your data but isn’t connected to any other spectra). In this case, you can change your network settings to draw singletons as well. 

  2. If you’re running a reference standard in a single sample (versus an internal standard spiked into all your samples), that standard may only show up in a single file. To reduce network complexity, default molecular networking will only create nodes for spectra found in at least two samples. You can also change this behavior in the networking settings.
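The two defaults above can be pictured as a simple filter on which nodes get drawn. The sketch below is only a rough model of that behavior: `min_samples` and `show_singletons` are hypothetical parameter names chosen for this example, not the actual names of our networking settings, and the sample data is made up.

```python
def visible_nodes(node_samples, edges, min_samples=2, show_singletons=False):
    """Return the set of nodes that would be drawn under the given settings."""
    # Rule 2: keep only nodes whose spectra appear in enough samples.
    kept = {
        node for node, samples in node_samples.items()
        if len(samples) >= min_samples
    }
    # Rule 1: by default, draw only nodes with at least one edge to a kept node.
    connected = set()
    for a, b in edges:
        if a in kept and b in kept:
            connected.update((a, b))
    return kept if show_singletons else connected


# Hypothetical data: which samples each node's spectra were found in.
node_samples = {
    "metabolite_1": {"s1", "s2"},
    "metabolite_2": {"s1", "s3"},
    "internal_std": {"s1", "s2", "s3"},  # in every sample, similar to nothing
    "reference_std": {"s1"},             # run in a single sample only
}
edges = [("metabolite_1", "metabolite_2")]

default_view = visible_nodes(node_samples, edges)
with_singletons = visible_nodes(node_samples, edges, show_singletons=True)
everything = visible_nodes(node_samples, edges, min_samples=1, show_singletons=True)
```

With the defaults, neither standard is drawn. Showing singletons brings back the internal standard, and also lowering the sample minimum to one brings back the reference standard.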

 

Want to learn more?

Have more questions? Contact us.
