Probabilistic matching infers that multiple devices belong to the same user by analyzing shared behavioral signals — IP addresses, location patterns, browser behavior — using statistical modeling.
Quick Answer
Probabilistic matching extends device graph reach 2–3x over deterministic matching with false match rates of 5–15%
Post-iOS 14, probabilistic matching fills the identity gaps left by restricted IDFA availability
Use incrementality testing (not view-through attribution) to validate probabilistic match-based campaign performance
How Probabilistic Matching Works
Probabilistic matching is a device graph methodology that identifies likely relationships between devices without direct user authentication. Algorithms look for shared signals (devices connecting from the same IP address, showing similar browsing behavior, using the same apps, or appearing in the same physical locations) and assign a probability score to each candidate device pair. For example, if a smartphone and a laptop consistently connect from the same home IP, visit the same news sites, and report the same time zone, the system assigns a high probability that they belong to the same person. Probabilistic models match more devices than deterministic ones (sometimes 2–3x the reach), but accuracy varies: false match rates of 5–15% are common in consumer markets.
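The scoring idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's actual model: the `Device` fields, the weights, the bias, and the choice of a logistic function are all hypothetical and picked for readability. A production system would learn its parameters from labeled device pairs.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Device:
    device_id: str
    ips: set = field(default_factory=set)    # IP addresses the device was seen on
    sites: set = field(default_factory=set)  # domains the device visited
    timezone: str = ""


def jaccard(a: set, b: set) -> float:
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical weights; a real model would fit these to labeled pairs.
WEIGHTS = {"ip": 3.0, "sites": 2.5, "timezone": 0.5}
BIAS = -3.0  # keeps the score low when no signal is shared


def match_probability(d1: Device, d2: Device) -> float:
    """Combine signal overlaps into a 0..1 score with a logistic function."""
    same_tz = 1.0 if d1.timezone and d1.timezone == d2.timezone else 0.0
    z = (
        BIAS
        + WEIGHTS["ip"] * jaccard(d1.ips, d2.ips)
        + WEIGHTS["sites"] * jaccard(d1.sites, d2.sites)
        + WEIGHTS["timezone"] * same_tz
    )
    return 1.0 / (1.0 + math.exp(-z))


phone = Device("phone-1", {"203.0.113.7"},
               {"news.example", "mail.example"}, "America/Chicago")
laptop = Device("laptop-1", {"203.0.113.7"},
                {"news.example", "mail.example", "docs.example"}, "America/Chicago")
stranger = Device("tablet-9", {"198.51.100.4"},
                  {"games.example"}, "Europe/Berlin")

print(f"phone vs laptop:   {match_probability(phone, laptop):.2f}")
print(f"phone vs stranger: {match_probability(phone, stranger):.2f}")
```

With these made-up weights, the phone and laptop that share a home IP, most sites, and a time zone score high, while the unrelated tablet scores close to zero. Where the cutoff for "same user" sits is a separate decision that trades reach against false matches.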
Why Probabilistic Matching Matters for B2B Marketing
Probabilistic matching has become more important in the post-iOS 14 era, in which deterministic mobile identity signals are restricted (since iOS 14.5, apps need opt-in permission before they can read the IDFA). For B2B marketers, probabilistic matching extends cross-device reach beyond the subset of prospects who are logged in across devices. It is particularly useful at the top of the funnel, where broad reach matters more than individual-level accuracy, and in markets where little deterministic data is available.
Probabilistic Matching: Best Practices & Strategic Application
When using probabilistic matching, validate performance through incrementality tests rather than view-through attribution, which can be inflated by false matches. Layer probabilistic cross-device reach with contextual and intent-based targeting to improve relevance. Always apply frequency caps at the probabilistic cluster level (not the individual device level) to prevent over-exposure when the matching model is uncertain. Regularly audit device graph vendors by checking how many device pairs you already know belong together their graph actually links, and how many known-unrelated pairs it links by mistake.
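The vendor audit mentioned above can be sketched as a comparison between a vendor's claimed links and a small ground-truth set. The function name `audit_vendor`, the input shapes, and the sample pairs are all hypothetical. The sketch assumes the two ground-truth lists do not overlap, and it defines the false match rate as the share of verifiable claimed links that are wrong; vendors may define it differently.

```python
def normalize(pair):
    """Treat (a, b) and (b, a) as the same pair."""
    return tuple(sorted(pair))


def audit_vendor(vendor_matches, known_same, known_different):
    """Compare a vendor's claimed links with ground-truth device pairs.

    Returns (match_rate, false_match_rate):
      match_rate       - share of known same-user pairs the vendor linked
      false_match_rate - share of the vendor's verifiable links that are wrong
    """
    claimed = {normalize(p) for p in vendor_matches}
    same = {normalize(p) for p in known_same}
    different = {normalize(p) for p in known_different}

    found = claimed & same        # correct links we can confirm
    wrong = claimed & different   # links we know to be false
    verifiable = len(found) + len(wrong)

    match_rate = len(found) / len(same) if same else 0.0
    false_match_rate = len(wrong) / verifiable if verifiable else 0.0
    return match_rate, false_match_rate


# Illustrative ground truth, e.g. from devices your own team controls.
known_same = [("phone-a", "laptop-a"), ("phone-b", "laptop-b"),
              ("phone-c", "tablet-c"), ("phone-d", "laptop-d")]
known_different = [("phone-a", "laptop-b"), ("phone-c", "laptop-d")]

# Links returned by the vendor; the last pair cannot be verified either way.
vendor_matches = [("laptop-a", "phone-a"), ("phone-b", "laptop-b"),
                  ("phone-c", "tablet-c"), ("phone-a", "laptop-b"),
                  ("phone-x", "laptop-y")]

match_rate, false_match_rate = audit_vendor(vendor_matches, known_same, known_different)
print(f"match rate: {match_rate:.0%}, false match rate: {false_match_rate:.0%}")
```

Links that appear in neither ground-truth list are ignored rather than counted as errors, since the audit cannot say anything about them.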
Agency Perspective: Probabilistic Matching in Practice
MV3 uses probabilistic matching as a reach extension layer on top of deterministic foundations. In B2B campaigns, we set conservative frequency caps on probabilistic audiences (2–3 exposures/week vs. 4–5 for deterministic) to account for false match risk. This approach extends reach by an average of 40–60% beyond deterministic-only targeting while maintaining campaign efficiency within acceptable accuracy thresholds.
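Cluster-level frequency capping can be sketched as a counter keyed by cluster rather than by device. The class below is a hypothetical illustration, not a description of any ad platform's API. The caps of 3 and 5 are picked from the ranges quoted above, and treating unknown devices as single-device probabilistic clusters is an assumption.

```python
from collections import defaultdict

# Weekly caps chosen from the ranges in the text (2-3 and 4-5 exposures).
WEEKLY_CAP = {"probabilistic": 3, "deterministic": 5}


class ClusterFrequencyCapper:
    """Counts impressions per cluster so linked devices share one budget."""

    def __init__(self, device_to_cluster, cluster_type):
        self.device_to_cluster = device_to_cluster  # device_id -> cluster_id
        self.cluster_type = cluster_type            # cluster_id -> match type
        self.impressions = defaultdict(int)         # cluster_id -> count this week

    def should_serve(self, device_id):
        """Return True and record the impression if the cluster is under its cap."""
        # Assumption: a device missing from the graph forms its own
        # probabilistic cluster and gets the conservative cap.
        cluster = self.device_to_cluster.get(device_id, device_id)
        cap = WEEKLY_CAP[self.cluster_type.get(cluster, "probabilistic")]
        if self.impressions[cluster] >= cap:
            return False
        self.impressions[cluster] += 1
        return True

    def reset_week(self):
        """Clear all counters at the start of a new week."""
        self.impressions.clear()


capper = ClusterFrequencyCapper(
    device_to_cluster={"phone-1": "c1", "laptop-1": "c1", "phone-2": "c2"},
    cluster_type={"c1": "probabilistic", "c2": "deterministic"},
)

# The phone and laptop draw from the same budget of three impressions.
decisions = [capper.should_serve(d)
             for d in ["phone-1", "laptop-1", "phone-1", "laptop-1"]]
print(decisions)
```

Capping per device instead would let the same person see up to three impressions on each linked device, which is the over-exposure the cluster-level cap is meant to avoid.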
Deterministic matching achieves 99%+ accuracy for confirmed device relationships (linked via logged-in email), but has lower reach. Probabilistic matching typically achieves 85–95% accuracy at the household level and 75–90% at the individual level, with accuracy varying by market and data richness. Most enterprise applications use hybrid models that favor deterministic matching where available and fill gaps with probabilistic inference.
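The hybrid approach can be sketched as a lookup that prefers a deterministic link and falls back to a probabilistic one only when the score clears a threshold. The function name, the dictionary shapes, and the 0.8 threshold are hypothetical and used only for illustration.

```python
def resolve_identity(device_id, deterministic_links, probabilistic_links,
                     threshold=0.8):
    """Return (user_id, method) for a device.

    deterministic_links: device_id -> user_id (e.g. from a logged-in email)
    probabilistic_links: device_id -> (user_id, score between 0 and 1)
    """
    # A confirmed link always wins.
    if device_id in deterministic_links:
        return deterministic_links[device_id], "deterministic"

    # Otherwise accept the inferred link only if the model is confident enough.
    candidate = probabilistic_links.get(device_id)
    if candidate is not None and candidate[1] >= threshold:
        return candidate[0], "probabilistic"

    return None, "unmatched"


deterministic_links = {"phone-1": "user-42"}
probabilistic_links = {
    "phone-1": ("user-99", 0.95),   # ignored: a deterministic link exists
    "laptop-1": ("user-42", 0.91),  # accepted: score clears the threshold
    "tablet-7": ("user-42", 0.55),  # rejected: score is too low
}

for device in ["phone-1", "laptop-1", "tablet-7", "tv-3"]:
    print(device, resolve_identity(device, deterministic_links, probabilistic_links))
```

Returning the method alongside the user ID lets downstream systems treat the two kinds of match differently, for example by applying a stricter frequency cap to probabilistic audiences.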
GDPR compliance for probabilistic matching requires a valid legal basis, typically user consent or legitimate interest. Under the GDPR and the ePrivacy Directive, probabilistic device fingerprinting without consent is a gray area that regulators have scrutinized. Most reputable device graph providers operating in the EU rely on consent-based signals collected through the IAB Europe Transparency and Consent Framework (TCF 2.0) rather than pure fingerprinting. Always work with vendors who document the legal basis for their matching methodology in each jurisdiction.
Use probabilistic matching when you need maximum audience reach — particularly for awareness campaigns where some false matches are acceptable and incremental exposure matters. Choose deterministic matching for retargeting, frequency-capped campaigns, and CRM-matched audiences where individual identity accuracy is critical. For most B2B campaigns, a hybrid approach using deterministic as the anchor and probabilistic for reach extension delivers the best performance.
MV3 Marketing helps B2B companies apply these strategies to drive measurable pipeline growth. Our team executes PPC management for technology, SaaS, and professional services companies.