Data brokers and the self-as-commodity

Data brokers are companies that collect, aggregate, analyze, and sell personal information about individuals, typically without those individuals' direct knowledge or meaningful consent. The industry is enormous and largely invisible: by the early 2020s, the largest data brokers held records on virtually every adult in the United States and billions of people worldwide. Unlike social media platforms, search engines, or financial institutions, data brokers generally have no direct relationship with the people whose data they trade. They are pure intermediaries in a secondary market for personal information, operating in the gap between the systems that originally collect data and the buyers who use it to make decisions about people. At collective scale, the data broker industry instantiates a specific social arrangement: the conversion of personhood into a tradeable commodity, conducted systematically, at population scale, without the participation or oversight of the people being commodified.

The self-as-commodity framing is not merely metaphorical. Data brokers produce what can accurately be called person-products: aggregated dossiers that attempt to capture the economically and behaviorally relevant dimensions of an individual's identity. These products are sold to advertisers targeting purchasing behavior, to employers screening applicants, to insurers assessing risk, to landlords evaluating tenants, to political campaigns identifying persuadable voters, to law enforcement building investigative leads, and to an expanding range of other buyers whose decisions materially affect the subjects of the dossiers. The person whose data is aggregated and sold is simultaneously the product and the person affected by decisions made using the product — a peculiar double position that has no clear analog in other commodity markets.

This arrangement has deep consequences for how personhood is constituted and experienced at collective scale. Traditional liberal theory imagines the self as the primary author of its own social presentation, choosing what to reveal, to whom, and in what contexts. Contextual integrity theory formalizes this intuition, holding that appropriate information flow follows the norms of the context in which information was originally shared. Data brokerage violates contextual integrity systematically and at scale. When information shared with a pharmacy (prescription records), a mobile carrier (location history), a loyalty program (purchase patterns), a property records system (address history), and social media platforms (expressed preferences) is aggregated into a unified commercial profile, the resulting product contains information that the individual never consented to share in that aggregated form, never intended for commercial use, and in most cases cannot inspect, correct, or contest.

The aggregate profile is not merely a collection of facts. It is an interpretation — a predictive model of the subject, scoring them on dimensions of creditworthiness, purchase likelihood, health risk, political persuadability, insurance exposure, and behavioral predictability. These scores shape the offers extended, the prices charged, the opportunities presented, and the rejections delivered across multiple domains of life. The person whose aggregate score places them in a high-risk insurance category pays higher premiums. The person whose purchase history marks them as financially stressed sees different credit offers than their wealthier counterpart. The person whose location data reveals regular visits to a fertility clinic may find that information sold to employers, marketers, or insurers. The data broker's product does not merely describe existing reality — it actively produces social reality by feeding into the algorithmic decision systems that allocate resources, opportunities, and risks.

The political economy of data brokerage reflects a fundamental misalignment of incentives. The companies that originally collect data — platforms, retailers, utilities, governments — face limited economic incentive to minimize collection, because data has value in secondary markets even when it has no operational necessity for the primary service. Data brokers face no direct accountability to the individuals whose data they hold, because those individuals are not their customers. The buyers of data broker products — advertisers, employers, insurers — face accountability for the decisions they make using brokered data only insofar as anti-discrimination law constrains them, which is uneven and often difficult to enforce when the discrimination is embedded in algorithmic scoring rather than explicit categorical decision-making. The person who is the subject of all this activity has no legal relationship with most of these entities in most jurisdictions and therefore no formal standing to contest decisions made using profiles they have never seen.

Law 4's stewardship imperative demands that those who hold and trade in personal information bear accountability proportionate to the power they exercise over subjects' lives. The data broker industry has, until recently, largely operated outside meaningful regulatory accountability. Vermont created the first state-level data broker registration law in 2018, and California followed with a separate registration statute in 2019. The California Consumer Privacy Act (2018) and its 2020 amendment, the California Privacy Rights Act (CPRA), created limited consumer rights to opt out of data sale and request deletion. The EU's GDPR applies to data brokers that process the personal data of people in the EU, requiring a lawful basis for processing and limiting secondary use without consent. These are meaningful but partial interventions: registration requirements make the industry's existence visible without addressing the underlying practices, and opt-out frameworks place the burden of protection on individual consumers, who must navigate complex opt-out processes broker by broker across thousands of firms.

What the data broker industry ultimately represents is the privatization of population surveillance — the transfer of the state's historical capacity to track and profile populations to commercial actors whose accountability is governed primarily by market incentives rather than constitutional constraints. Law 2 (the law of pattern and correspondence) is implicated because aggregate profiles encode social patterns — patterns of race, class, geography, health — that translate individual characteristics into group-level predictions, producing the discriminatory feedback loops of algorithmic bias at scale. Understanding the self-as-commodity at collective scale means confronting the fact that personhood has been made legible to commercial power in ways that individual persons cannot see, contest, or govern.