Inside FoundryOps: FoundryOne™ & the Foundry Graph™

The accuracy you see isn't magic—it's engineering. A high-performance entity-resolution engine + a living open-data graph deliver explainable, privacy-safe precision across every FoundryOps product.

What is FoundryOne™?

FoundryOne is our B2B-tuned entity-resolution engine. It powers matching in Google Sheets, the Gremlin CLI, and the FoundryOps API. Built for 10M+ rows, multi-core performance, and zero-guesswork matching with reason chips.

Real-World Example

When matching IBM vs International Business Machines vs ibm.com, FoundryOne resolves all three to the same company, with reason chips showing which signals fired: domain match + parent company signal + alias match.

Performance at scale

Multi-core, 10M+ rows, smart blocking for speed and recall.

→ Run 10M-row dedupes locally without cloud costs
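
To make "smart blocking" concrete, here is a minimal sketch of the general technique, not FoundryOne's actual implementation. Records are grouped by a cheap key, and only records that share a key are compared. The `blocking_key` rule (first three alphanumeric characters) is an assumption chosen for illustration; production systems typically combine several keys to protect recall.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(name):
    """Hypothetical blocking key: first three alphanumeric characters, lowercased."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum())
    return cleaned[:3]


def candidate_pairs(names):
    """Return only the pairs that share a blocking key."""
    blocks = defaultdict(list)
    for name in names:
        blocks[blocking_key(name)].append(name)
    pairs = set()
    for members in blocks.values():
        # Compare records inside a block, never across blocks.
        pairs.update(combinations(sorted(members), 2))
    return pairs


names = ["Salesforce Inc", "Salesforce.com", "Apple Inc", "Apple Computer", "IBM Corp"]
pairs = candidate_pairs(names)
all_pairs = len(names) * (len(names) - 1) // 2
print(f"{len(pairs)} candidate pairs instead of {all_pairs}")
```

A single prefix key like this would miss pairs such as "IBM" and "International Business Machines", which is why one blocking key alone is rarely enough.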

Explainable accuracy

Reason chips, domain & family signals, and transparent scoring.

→ Audit every match for compliance

Privacy-safe architecture

Open data sources, no PII resale, in-memory processing.

→ No vendor lock-in to data brokers

Why Multi-Algorithm Matching Matters

Unlike tools that rely on a single matching technique (usually fuzzy string matching), FoundryOne combines multiple specialized algorithms and picks the right one for each data type. Here's why that matters:

Example 1: Abbreviations
Your CRM has:
"IBM Corp" vs "International Business Machines"
Generic fuzzy: 30% similar → No match
FoundryOne: Token acronym → 95% match

→ Don't waste hours manually merging obvious duplicates
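
As a rough illustration of acronym matching (a sketch of the idea, not the engine's code), the check below compares a single-token name against the initials of a longer name. The `LEGAL_SUFFIXES` list and the function names are assumptions made for this example.

```python
import re

# Hypothetical list of legal suffixes to ignore before comparing.
LEGAL_SUFFIXES = {"corp", "corporation", "inc", "ltd", "llc", "co", "company"}


def tokens(name):
    """Lowercase word tokens with legal suffixes removed."""
    words = re.findall(r"[A-Za-z0-9]+", name.lower())
    return [w for w in words if w not in LEGAL_SUFFIXES]


def is_acronym_of(short, long):
    """True when `short` is one token equal to the initials of `long`."""
    short_tokens = tokens(short)
    long_tokens = tokens(long)
    if len(short_tokens) != 1 or len(long_tokens) < 2:
        return False
    initials = "".join(w[0] for w in long_tokens)
    return short_tokens[0] == initials


print(is_acronym_of("IBM Corp", "International Business Machines"))
```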

Example 2: Typos
A rep enters:
"Saelsforce Inc"
(swapped 'e' and 'l')
Generic fuzzy: Different token → No match
FoundryOne: Levenshtein + phonetic → Match

→ Dirty data doesn't break your CRM hygiene
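
For readers who want to see the edit-distance half of this in code, here is a minimal sketch of classic Levenshtein distance turned into a similarity score. The phonetic half is not shown, and the `similarity` scaling is an assumption for illustration rather than FoundryOne's scoring formula.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Distance scaled to 0..1 by the longer string (illustrative scaling)."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b)) or 1
    return 1 - levenshtein(a, b) / longest


print(levenshtein("saelsforce", "salesforce"))
print(round(similarity("Saelsforce", "Salesforce"), 2))
```

The swapped letters cost two substitutions under plain Levenshtein, so the pair still scores high enough to surface as a likely match.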

Example 3: International Names
You have:
"Société Générale" vs "Societe Generale"
⚠️ Generic fuzzy: 90% similar → Maybe
FoundryOne: Unicode normalization → Exact

→ Global companies with non-English names work correctly
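
A small sketch of why normalization turns a "maybe" into an exact match, using Python's standard `unicodedata` module. The `fold_accents` helper is a name invented for this example and is only suitable for Latin-script names, since it removes every combining mark.

```python
import unicodedata


def fold_accents(text):
    """Decompose characters, drop combining marks, and casefold (Latin text only)."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


precomposed = "Société Générale"
# Same name typed with separate combining accents (U+0301).
combining = "Socie\u0301te\u0301 Ge\u0301ne\u0301rale"

print(precomposed == combining)
print(fold_accents(precomposed) == fold_accents(combining) == fold_accents("Societe Generale"))
```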

Example 4: Subsidiaries
You see:
"GE Healthcare" vs "General Electric Company"
Generic fuzzy: Low overlap → Different
FoundryOne: Domain + Graph hierarchy → Match

→ Attribution shows the real parent company
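
One way to picture a hierarchy match is a walk up parent links until the top of the family is reached. The sketch below uses a toy `PARENT_OF` dictionary as a stand-in for graph data: the first entry mirrors the example above and the "Acme" names are made up.

```python
# Toy parent links; a stand-in for graph data, not real Foundry Graph records.
PARENT_OF = {
    "GE Healthcare": "General Electric Company",
    "Acme Labs": "Acme Holdings",
    "Acme Holdings": "Acme Group",
}


def ultimate_parent(entity, parent_of):
    """Follow parent links to the top of the family, guarding against cycles."""
    seen = set()
    while entity in parent_of and entity not in seen:
        seen.add(entity)
        entity = parent_of[entity]
    return entity


def same_family(a, b, parent_of=PARENT_OF):
    """Two records belong together when they share an ultimate parent."""
    return ultimate_parent(a, parent_of) == ultimate_parent(b, parent_of)


print(same_family("GE Healthcare", "General Electric Company"))
print(ultimate_parent("Acme Labs", PARENT_OF))
```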

Example 5: Domain Data
You have:
"Apple Inc" vs "apple.com"
Generic fuzzy: No text overlap → No match
FoundryOne: Domain extraction + Graph → Match

→ Web traffic and CRM data unified automatically
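
Domain extraction can be sketched with the standard library alone. The `extract_domain` helper and the one-entry `DOMAIN_TO_COMPANY` table below are illustrative assumptions; the real lookup would go through the Foundry Graph.

```python
from urllib.parse import urlsplit

# Toy lookup table standing in for a domain-to-company mapping.
DOMAIN_TO_COMPANY = {"apple.com": "Apple Inc"}


def extract_domain(value):
    """Pull a bare hostname out of a URL, an email address, or a plain domain."""
    value = value.strip().lower()
    if "@" in value and "://" not in value:
        value = value.rsplit("@", 1)[1]   # keep the part after the @
    if "://" not in value:
        value = "//" + value              # lets urlsplit treat it as a host
    host = urlsplit(value).hostname or ""
    return host[4:] if host.startswith("www.") else host


for raw in ["https://www.apple.com/mac", "someone@apple.com", "apple.com"]:
    domain = extract_domain(raw)
    print(raw, "->", domain, "->", DOMAIN_TO_COMPANY.get(domain))
```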

The Key Difference

Every match includes reason chips showing exactly which algorithms fired, so you know why FoundryOne made each decision.

No black boxes. No guesswork.

Scenario | Single-Algo Tool | FoundryOne Multi-Algo | Why It Matters
"IBM Corp" vs "International Business Machines" | ❌ Low similarity (30%) | ✅ Token acronym match (95%) | RevOps teams don't waste hours manually merging obvious matches
"Saelsforce" (typo) vs "Salesforce" | ❌ Treated as different | ✅ Levenshtein + phonetic | Dirty data doesn't break your CRM hygiene
"Société Générale" vs "Societe Generale" | ⚠️ Accent mismatch | ✅ Unicode normalization | Global companies with non-English names work correctly
"GE Healthcare" vs "General Electric" | ❌ Different entities | ✅ Parent-child hierarchy | Attribution reports show the real parent company
"apple.com" vs "Apple Inc" | ❌ No overlap | ✅ Domain → company lookup | Web traffic and CRM data can be unified

What is the Foundry Graph™?

A continuously updated, privacy-safe corporate graph from public sources (Wikidata, official registries), mapping domains, brands, and hierarchies. Updated nightly with full audit trails—every data point is traceable to its source. No PII, no resale, no black boxes.

  • 4.9M+ company profiles
  • 3.2M+ LEI-verified entities (direct from GLEIF)
  • 290K+ parent/subsidiary links
  • 69K+ domain mappings
  • 230 countries represented (global coverage)
  • 20+ languages (multi-language labels via Wikidata)
  • Millions of profile lookups processed monthly

Why Transparency Matters

Unlike opaque data brokers who won't tell you where their data comes from, every attribute in the Foundry Graph is traceable to Wikidata or official registries. Full audit trails, privacy-safe first-party crawls, and nightly automated updates—no black boxes, no mystery data.

Traceable sources: Wikidata + public registries
Nightly rebuilds with SPARQL reconciliation
Full audit trails for compliance
Coverage snapshots published

What This Means for Your Matching

  • Better multi-branch matching: 290K+ parent/subsidiary links mean FoundryOne can collapse subsidiaries into a single enterprise record for ABM rollups
  • Domain family intelligence: 69K+ domain mappings significantly improve domain-based matching accuracy
  • Regional expansion: Strongest coverage in US/EU with rapid APAC growth—230 countries represented

How they work together

Source Data: your customer lists, CRM records
FoundryOne: match rules + scoring
Foundry Graph: domains/brands/hierarchy
Results: explainable matches with reason chips
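
The flow above can be pictured as a short sketch: each source value is resolved against graph data, and the result carries the reasons that fired. Everything here is hypothetical, including the `MatchResult` shape, the toy `GRAPH_DOMAINS` and `GRAPH_ALIASES` tables, and the 1.0/0.0 scores. It is not the FoundryOps API.

```python
from dataclasses import dataclass, field


@dataclass
class MatchResult:
    left: str
    right: str
    score: float
    reasons: list = field(default_factory=list)  # the "reason chips"


# Toy stand-ins for graph lookups (assumed data, for illustration only).
GRAPH_DOMAINS = {"ibm.com": "International Business Machines"}
GRAPH_ALIASES = {"ibm": "International Business Machines"}


def resolve(record):
    """Map a raw value to a canonical name plus the signal that explained it."""
    key = record.strip().lower()
    if key in GRAPH_DOMAINS:
        return GRAPH_DOMAINS[key], "domain match"
    if key in GRAPH_ALIASES:
        return GRAPH_ALIASES[key], "alias match"
    return record.strip(), "exact name"


def match(left, right):
    """Compare two raw values after resolution and keep the reasons."""
    left_entity, left_reason = resolve(left)
    right_entity, right_reason = resolve(right)
    if left_entity.lower() == right_entity.lower():
        return MatchResult(left, right, 1.0, sorted({left_reason, right_reason}))
    return MatchResult(left, right, 0.0, [])


print(match("ibm.com", "IBM"))
```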

Built for a Global Audience

Most fuzzy matching tools break when you throw Chinese, Japanese, or Arabic text at them. FoundryOne was engineered from day one to handle the complexity of international data—without corruption, data loss, or character mangling.

Full Unicode Support

Match and deduplicate data in 50+ languages including Chinese (中文), Japanese (日本語), Korean (한국어), Arabic (العربية), Russian (Русский), Hebrew, Thai, and all Latin-script languages.

No character restrictions—works with emoji, special characters, and mixed scripts.

Smart Algorithm Switching

FoundryOne automatically detects language script and switches to optimized algorithms. CJK_DICE2 for character-based languages, phonetic matching for Latin alphabets.

Most tools use Levenshtein/Jaro-Winkler designed for alphabetic languages—they fail on CJK.
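
To show the general idea behind script detection and bigram matching, here is a minimal sketch. It assumes CJK_DICE2 is a Dice coefficient over character bigrams, which the name suggests but this page does not confirm, and the `is_cjk` and `pick_algorithm` helpers are hypothetical.

```python
import unicodedata

CJK_PREFIXES = ("CJK UNIFIED", "HIRAGANA", "KATAKANA", "HANGUL")


def is_cjk(text):
    """True when any character's Unicode name marks it as CJK, kana, or Hangul."""
    return any(unicodedata.name(ch, "").startswith(CJK_PREFIXES) for ch in text)


def bigrams(text):
    """Set of overlapping two-character slices."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def dice(a, b):
    """Dice coefficient over character bigram sets."""
    x, y = bigrams(a), bigrams(b)
    if not x or not y:
        return 1.0 if a == b else 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def pick_algorithm(text):
    """Hypothetical router: bigram overlap for CJK, edit distance otherwise."""
    return "dice_bigrams" if is_cjk(text) else "edit_distance_or_phonetic"


print(pick_algorithm("阿里巴巴集团"), pick_algorithm("Alibaba Group"))
print(dice("阿里巴巴集团", "阿里巴巴"))
```

Bigram overlap suits scripts where a single character carries a lot of meaning: one changed character touches at most two bigrams, and the score does not depend on word boundaries.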

International Domains

Full support for IDN (Internationalized Domain Names) and Punycode. Match Russian .рф domains, Chinese .中国 extensions, and all Unicode TLDs.

Script-aware normalization preserves non-Latin characters.
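
For a feel of how Unicode domains and their Punycode forms line up, the sketch below uses Python's built-in `idna` codec, which implements the older IDNA 2003 rules. The helper names and the sample domains are assumptions for illustration, and this is not a description of FoundryOne's internals.

```python
def to_ascii(domain):
    """Convert a Unicode domain to its ASCII (Punycode) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain):
    """Convert an ASCII (Punycode) domain back to Unicode."""
    return domain.encode("ascii").decode("idna")


def same_domain(a, b):
    """Compare two domains regardless of which form each one arrives in."""
    return to_ascii(a.lower()) == to_ascii(b.lower())


for domain in ["пример.рф", "例子.中国"]:
    ascii_form = to_ascii(domain)
    print(domain, "->", ascii_form, "->", to_unicode(ascii_form))
```

Comparing on the ASCII form means a record holding the Punycode spelling and a record holding the native-script spelling land on the same key.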

Google Sheets Integration

All formulas (FMATCH, FDEDUPE, FENRICH) work with international characters. UTF-8 throughout—reading and writing preserves Unicode integrity.

API communication maintains character integrity across all operations.

Salesforce Integration

Full Unicode preservation in push/pull operations. SOQL queries handle international characters correctly. Multi-encoding CSV import support.

Tries UTF-8, Latin-1, Windows-1252 to ensure your data imports cleanly.
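
A multi-encoding import can be sketched as a simple fallback loop. The order below is an assumption: Latin-1 accepts every possible byte, so this sketch tries it last, after Windows-1252. The `decode_csv_bytes` name is hypothetical.

```python
def decode_csv_bytes(raw, encodings=("utf-8-sig", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue  # try the next candidate
    raise ValueError("no candidate encoding could decode the input")


utf8_bytes = "Société Générale".encode("utf-8")
legacy_bytes = "Société Générale".encode("cp1252")

print(decode_csv_bytes(utf8_bytes))
print(decode_csv_bytes(legacy_bytes))
```

Trying UTF-8 first matters because it is strict: legacy single-byte files almost always fail UTF-8 decoding, so a clean UTF-8 decode is a strong signal.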

230 Countries

The Foundry Graph represents companies from 230 countries with multi-language labels via Wikidata. Strongest coverage in US/EU, rapid expansion in APAC.

Over a third of profiles include country tags for regional filtering.

Real-World Examples

Chinese Company Names
阿里巴巴集团 vs Alibaba Group

CJK_DICE2 algorithm matches character bigrams, no data loss

Japanese Domain Matching
ソニー株式会社 vs sony.jp

Domain extraction + graph lookup unifies Japanese/English variants

Arabic Script Support
شركة أرامكو السعودية → ✓ Matched

Right-to-left scripts preserved, no character corruption

Cyrillic Company Names
ПАО Газпром vs gazprom.ru

Cyrillic-to-Latin domain mapping via Foundry Graph

✅ Complete Unicode Preservation

Full Unicode support across match, dedupe, transforms, and Salesforce sync — no caveats, no character corruption. We've engineered every transform to preserve international characters end-to-end.

Script-aware normalization: Strips Latin diacritics only when needed ("José" → "jose"), preserves CJK/Cyrillic/Arabic intact
Multilingual corporate markers: Handles 株式会社, 有限公司, 控股, ООО, GmbH without data loss
NFKC normalization: Consistent Unicode handling across all transforms
End-to-end preservation: Input CJK/Cyrillic/Arabic/Hebrew → stays intact through all operations
Real Examples (Now Fully Supported):
"Société Générale" → "societe generale"
"株式会社メルカリ" → "メルカリ"
"腾讯控股有限公司" → "腾讯"
"ООО \"Рога и копыта\"" → "рога и копыта"

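The normalization rules listed above can be sketched in a few lines. This is an illustration under assumptions, not the shipped transform. The marker list is limited to the five markers named on this page, matching is by plain substring, and `normalize_name` is a hypothetical helper.

```python
import unicodedata

# Corporate markers named on this page; a real list would be far longer.
CORPORATE_MARKERS = ["株式会社", "有限公司", "控股", "ООО", "GmbH"]
QUOTES = "\"'«»“”"


def strip_latin_diacritics(text):
    """Remove accents from Latin letters only; other scripts pass through untouched."""
    out = []
    for ch in text:
        if unicodedata.name(ch, "").startswith("LATIN"):
            decomposed = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
        else:
            out.append(ch)
    return "".join(out)


def normalize_name(name):
    text = unicodedata.normalize("NFKC", name)       # consistent Unicode form
    for marker in CORPORATE_MARKERS:
        text = text.replace(marker, " ")             # drop corporate markers
    text = "".join(" " if ch in QUOTES else ch for ch in text)
    text = strip_latin_diacritics(text)
    return " ".join(text.split()).casefold()         # tidy spaces, lowercase


for raw in ["Société Générale", "株式会社メルカリ", "腾讯控股有限公司", 'ООО "Рога и копыта"']:
    print(raw, "->", normalize_name(raw))
```

Checking the script before stripping marks is what keeps letters such as Cyrillic "й" intact. A blanket accent strip would turn it into "и".
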
Note on classification: Title/industry/job function detection uses English keyword matching—it won't corrupt foreign strings, but won't infer non-English semantics. Our UI is currently English-only, but the core engine processes data in any language.

Why This Matters vs. Competitors

Most Fuzzy Matching Tools
  • Use Levenshtein/Jaro-Winkler (designed for alphabetic languages)
  • Fail on CJK character-based languages
  • Strip non-ASCII characters "for safety"
  • No script detection or algorithm switching
FoundryOne
  • CJK_DICE2 algorithm for character-based languages
  • Script detection + automatic algorithm switching
  • Full Unicode preservation in matching pipeline
  • Multi-language Wikidata labels in Foundry Graph

Benchmarks & Proof

  • 95%+ top-1 precision on real B2B datasets
  • 10M rows processed in minutes on modern laptops
  • 30–40% fewer false positives vs. generic fuzzy matching
  • 3–5× faster than naive string matching (and more accurate)
  • <1 second matching response time for 99% of queries
  • 20+ languages supported via Wikidata
  • Nightly updates: graph refreshed from public sources

Privacy & Trust

Open-data sourcing

Transparent provenance from public datasets like Wikidata

No PII resale

No data brokerage, no selling your customer data

In-memory processing

Regional controls and secure processing

Auditability

Explainability for every match with reason chips

Experience FoundryOne™ Today

Start using FoundryOne-powered matching in Google Sheets, or explore the Gremlin CLI and API.