chapter

Data Liberation

Claudio Agosti

Technopolitics was happening before my eyes, and the ‘I told you so’ wasn’t helping anyone. Even though a small group of techno-paranoids discouraged activists from using Facebook and Instagram, the platforms’ promise of an instant audience and the feeling of increased reach easily captivated even politically savvy friends, relatives, and everyone who felt they had something to say. Now imagine their faces when their political events were taken down. Or when expressing solidarity with a cause was flagged as hate speech. Imagine falling into such a trap willingly and then realizing you can’t leave, because of the so-called critical mass. Technopolitics is happening, and we’re not just passive spectators. We produce data, and that data fuels what is referred to here as surveillance-based social media platforms: large, centralized social media services built on profiling, targeted advertising and attention extraction. Since this data comes from us, in theory, we could control it, but that’s not happening. Is society doomed to Zuckerland? Can we reverse this? This article tells the story of my first-hand attempts to do so between 2019 and 2023. It mixes theory and practice to promote an innovative tactic of subversion. We’ll start by analyzing the reasons for this worldwide Stockholm syndrome, in which even if you hate a platform, you can’t leave it.

Are people really enslaved by so-called dopamine addiction? Are our friends hostages of platforms, forced to maintain an account so as not to feel isolated? Or are valuable sources, such as media and public institutions, forced to be there because of the promise of visibility? All of the above.

We’ll discuss one of the most problematic powers of surveillance-based media platforms, then address my attempts to ‘break free’ by technological means, and finally deal with the hard question: how much can a technical solution actually solve the problem of a society that doesn’t question the politics embedded into digital platforms?

The ultimate aim of this piece is to inspire you to apply a grassroots form of resistance, a mixture of technology and politics, full of challenges and opportunities. We’ll dissect it all.

Problem identification

Having amassed billions of users through convenient services, these platforms are difficult to leave because of the network effect: everyone and everything is on the dominant network, so departing means losing contact, audience and influence. As user+consumers we are trapped inside surveillance-based social media platforms; the social pressure to remain comes at the cost of autonomy. You stay, but your feed fills with AI-generated sludge, carefully tuned to exploit the weak spots inferred from your past behavior.

As Tristan Harris¹ and others have popularized, the attention economy runs on continuous experiments: algorithms test millions of variations to see what keeps you scrolling a little longer, turning your nervous system into the target of an optimization problem. That is the core of our social dilemma: we need the content and connections that live there, but accessing them is wrapped in psychological traps. Staying outside means losing valuable information or contact with loved ones; staying inside means being constantly subject to techniques designed to extract time, data and attention.

‘If people don’t want to stay on Facebook nobody is forcing them.’

This is what a Facebook spokesperson-lobbyist once told me at a human rights conference—one of those events the company periodically visits to feign concern for minorities and other token causes. It was RightsCon 2017. Independent social media already existed, so I replied (paraphrasing; our younger selves always sound sharper in hindsight):

That’s not really true, is it? Sure, nobody forces us explicitly, but Facebook designed the entire environment so leaving feels like cutting yourself off from everyone you know. You built a system where all our friends, groups, events, and memories are trapped, and then you shut down every API that once allowed developers to build bridges out. There’s no real portability. You can’t take your network with you, and that’s the real lock-in. People want alternatives; they just can’t afford to lose their social connections. Meanwhile, the free software community and open-protocol research keep growing, and the Fediverse actually allows that kind of interoperability, where users and platforms can talk to each other freely. But Facebook made sure that freedom was impossible, not because it’s technically hard, but because it’s bad for their business model.

As a historical aside, one of Facebook’s winning moves at the beginning, when it was competing with Myspace for critical mass, was a technical hack to lower switching costs. If you want to know more, an article by Cory Doctorow² explains it well.

Let’s pretend that a bit of hacking can solve everything

In this article, I would like to suggest a radical approach: to copy the content from these platforms and republish it in free networks such as the Fediverse. This would enable people to find content more easily and potentially reduce switching costs.

I’ll often use the term ‘scraping’, which has recently been co-opted by AI companies to justify their massive data theft. But this is not what web scraping generally means; the definition of web scraping³ is more neutral (paraphrased from Techopedia):

Scraping (or web scraping / data scraping) is the technique by which a computer program automatically extracts data from content originally intended for human reading (e.g. HTML pages, PDFs, websites) and converts it into structured, machine-readable formats (such as JSON, CSV, or database tables) for processing, storage, analysis, or reuse.
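To make the definition concrete, here is a minimal sketch of the HTML-to-JSON step using only Python's standard library. The markup, the class names and the date are invented for illustration and do not correspond to any real platform's pages; real pages are far messier and change often.

```python
import json
from html.parser import HTMLParser

# Hypothetical markup: a made-up page fragment, not any real platform's HTML.
SAMPLE_HTML = """
<article class="event">
  <h2 class="title">Cheshire-cat-parade</h2>
  <time datetime="2021-06-12T15:00">12 June, 15:00</time>
  <span class="place">Wonderland Platz</span>
</article>
"""


class EventParser(HTMLParser):
    """Collects the text of a few known elements into a flat dictionary."""

    # Maps a class attribute in the sample markup to an output field name.
    FIELDS = {"title": "title", "place": "location"}

    def __init__(self):
        super().__init__()
        self.record = {}
        self._current = None  # field whose text we are currently reading

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Prefer the machine-readable timestamp over the human-readable text.
        if tag == "time" and "datetime" in attrs:
            self.record["start"] = attrs["datetime"]
        self._current = self.FIELDS.get(attrs.get("class", ""))

    def handle_data(self, data):
        if self._current and data.strip():
            self.record[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


def scrape(html: str) -> dict:
    """Turn human-oriented HTML into a structured, machine-readable record."""
    parser = EventParser()
    parser.feed(html)
    return parser.record


print(json.dumps(scrape(SAMPLE_HTML), indent=2))
```

The point of the sketch is the direction of travel: content written for human eyes goes in, and a structured record that can be stored, searched or republished comes out.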

Scraping is the technique allowing us to collect data. Instead of harvesting for a profit-seeking purpose, we’ll do it for the public benefit. Think of a cultural institution that has been publishing its programme on a Facebook page for years, with no proper archive anywhere else. Or that incredibly useful resource whose articles are wrapped in cookie walls, tracking scripts, and attention-grabbing ads. Or a grassroots collective that coordinates protests or mutual aid only through an Instagram profile that could vanish with one moderation mistake. These are the kinds of situations Data Liberation is concerned with: public-interest content that exists, but is locked inside the enclosures of Meta, Google, ByteDance, etc., aka surveillance-based social media platforms.

Since social media data is the reason why people remain on the surveillance-based platforms, if this data moves into a more ethically maintained network, maybe it can spare that friend of yours the profiling and the advertising?

Data Liberation is a concept to overcome the lack of critical mass and weak network effects. The idea is to free some content, from being only an asset of the surveillance-based media platform to becoming available on free social media networks too. An adversarial bridge. It’s a bet: many things could go wrong, but below, we’ll explore all the technical, ethical and legal complexities we must consider.

How to scrape without losing information

The most basic way to copy something is to take a screenshot. But a screenshot is almost useless for meaningful navigation: you cannot search, filter or build any interface on top of it. To make liberated content really usable, we need structure, and that means metadata: labels and context that describe, organize and connect pieces of data.

2.2Agosti_image1_1.png — Fig 1: Example of a social media post annotated to show how many layers of metadata surround a single piece of content: profile name, publication time, hashtags, profile links, external link, alternative text, album preview, (an AI-generation typo!), and interaction counters.

On social networks, almost everything you experience is driven by said metadata: names and usernames you can click, timestamps, links, hashtags and mentions, counts of likes, comments and shares. In the Facebook post shown in the figure, each balloon simply points to one of these markers. If we only capture the raw text and image, and do not also extract this surrounding information, we lose most of what makes the post navigable.

For people used to privacy debates, metadata is usually what enables surveillance and behavioral targeting. An article by David Golumbia⁴ helped me see how deep this goes: what we think of as ‘personal data’ is in fact a stack of layers. There is the data you explicitly give, the data the platform observes about your behavior, and then the extra categories it infers by analyzing both—interests, risk scores, predicted attributes. In that sense, a lot of what matters most about you online is metadata created around your actions.

We can apply the same lens to social media content. A post is the base layer; around it, platforms attach automatic metadata (who posted, when), creator-supplied tags and links, social signals such as replies and reshares, and, above all, inferred metadata: rankings, relevance scores and predictions about who should see it. This invisible layer is what lets them decide what surfaces in feeds and searches.
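The layers described above can be sketched as a small data structure. This is only an illustration: the class and field names are hypothetical, chosen to mirror the annotations in the figure, and the sample posts are invented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LiberatedPost:
    # Base layer: the content itself.
    text: str
    # Automatic metadata, attached by the platform at publication time.
    author: str
    published: str  # ISO 8601 timestamp
    # Creator-supplied metadata.
    hashtags: list[str] = field(default_factory=list)
    links: list[str] = field(default_factory=list)
    alt_text: Optional[str] = None
    # Social signals visible on the page.
    likes: int = 0
    comments: int = 0
    shares: int = 0
    # The inferred layer (rankings, relevance scores) has no field here:
    # it is never shown on the page, so a scraper cannot capture it.


def filter_by_hashtag(posts, tag):
    """Return the posts carrying the given hashtag, ignoring case and '#'."""
    wanted = tag.lower().lstrip("#")
    return [
        p for p in posts
        if wanted in (h.lower().lstrip("#") for h in p.hashtags)
    ]


posts = [
    LiberatedPost("Parade on Saturday", "Cheshire Cat", "2021-06-10T09:00:00Z",
                  hashtags=["#Wonderland", "#parade"]),
    LiberatedPost("Tea at six", "Mad Hatter", "2021-06-11T18:00:00Z"),
]
print([p.author for p in filter_by_hashtag(posts, "wonderland")])
```

A screenshot could not support even this trivial filter; a record with the surrounding metadata can.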

If Data Liberation had access to all these layers, we could build indexing, filtering and discovery tools that respect people’s agency instead of the platform’s business model. Yet today the most powerful part (the inferred metadata) remains locked away as a proprietary asset. Inferred metadata is treated as the company’s property, even though it is built on top of our behavior and histories. So far, when people exercise their data-access rights, surveillance-based social media platforms never return this inferred layer to the user. This is so, even though such inferred metadata must be considered personal data subject to legal protection under the GDPR. But at the moment we do not even know the full extent of this inferred data, let alone how to reclaim it. How can we change that? Glad you asked, follow me down the rabbit hole.

Strategic litigation

To ensure the enforcement of our rights, we or civil society actors can initiate legal proceedings to establish a legal precedent. This approach is known as strategic litigation because the case itself may not be particularly significant, yet the legal principle it sets carries major significance for civil society. Between 2019 and 2022, I led such a legal action seeking access to inferred metadata (by proving that users were being profiled through a protected category of personal data, which would oblige the company to disclose such information).⁵ But this approach was unsuccessful. It took years and, even though that fight still needs to be won by civil society, I had enough time to reconsider our needs more carefully and call that approach into question.

If it was true that we need rich metadata to offer a high-quality experience with liberated content, I found myself torn between a question of principle and a question of strategy. In principle, people should be able to obtain all their data back, including inferred profiles and opaque classifications. Strategically, though, even if we did gain access to that corporate metadata, building on top of it would mean importing their categories, their biases and their worldview into our own tools. It made me realize that metadata is not just a technical detail: it is where power sits, and whoever generates it decides what can be seen, found and valued.

We need our own metadata: a smart scraping mechanism that fetches, clones and enriches the content. This would combine automation, existing classification and manual intervention. Luckily, I had the safest fallback from any legal failure: a battle-tested technology.

Homework done, or, why regulations won’t save us

In 2016, I started a project called Tracking Exposed.⁶ Its goal was to prove that social media algorithms were harmful and explore ways to challenge them. At the beginning I worked with academia, published peer-reviewed research, and taught students how to recognize algorithmic bias. It worked out: the project has since evolved into two specialized spin-offs, AI Forensics⁷ and Reversing.Works.⁸

Starting from a niche research area, we found a way to use evidence of algorithm-driven harms to prove GDPR violations and put legal pressure on exploitative social media platforms. The debate on the Digital Services Act (DSA) was informed by the difficulties we encountered with scraping and reverse engineering algorithms.

But let’s be realistic: are regulation and strategic litigation alone really going to resolve this issue? Hardly. They make corporate social media spend more money and force them to comply with laws that were designed many years too late, and meanwhile new exploitation dynamics are deployed on us.

Plus, the influence of the trillion-dollar industrial complex that is surveillance capitalism cannot be unseated by fines. Users will continue to be trapped until we take real action to break up monopolies. Is this going to be a gift from above? I don’t think so, or at least it shouldn’t be our only hope. If we want a future beyond corporate control, we will have to save ourselves.

Over the years, I have built a set of scrapers for Facebook, YouTube, TikTok and other platforms in Tracking Exposed, which are designed to study how their algorithms behave in practice. When the director of the documentary Nothing to Hide⁹ reached out to me with a new project on alternative networks such as the Fediverse, I realized that it was the right time and place to experiment with a more proactive solution: copying public-interest content out of surveillance-based media platforms into autonomous spaces. That’s when the concept of Data Liberation began to be tested.

The Disappear Documentary, Berlin Clubs, and Librevents

Around 2020, two forces converged: a documentary project called Disappear was underway (short version on YouTube ‘Under the radar: covering your online tracks’),¹⁰ aiming to explain the ecosystem around digital autonomy and spotlighting the Fediverse (more details below). At the same time, Berlin’s underground club scene was running into a growing problem: event censorship and shadow banning on Facebook. For communities that thrive in the margins, it was a threat to their existence.¹¹

By ‘events’, we mean announcements with a date, location, and title. For example: ‘Cheshire-cat-parade starting from 15:00 in Wonderland Platz’. The organizer’s goal is to be as visible as possible, especially to their own people. And to do that, they need to follow those people where they already are: on Facebook. A deadlock. Nobody migrates to a place with no content and no audience. The core of the problem for event organizers has always been ‘If your party starts disappearing from Facebook, do you still exist?’

We established Librevents to liberate the event announcements of the Berlin club scene from Facebook by publishing them on the Fediverse.

Fediverse, and the crucial role of ActivityPub

Let me take a short detour here to explain what the Fediverse is. Its complexity is one of the challenges of empowering Data Liberation. The Fediverse¹² is a network of interconnected platforms built on ActivityPub, rather than one single website. ActivityPub¹³ is an open, decentralized protocol that allows different platforms to exchange social data and talk to each other without asking permission from a central gatekeeper. One platform is Mastodon,¹⁴ probably the best-known example: a Twitter-like microblogging platform that runs on ActivityPub. Mobilizon¹⁵ is a federated events and groups platform, designed as an ethical alternative to Facebook Events, so that communities can organize and share events without relying on a centralized platform. Gancio¹⁶ is a community events calendar used by local groups and social centers to publish what happens in their neighborhoods.

These services use ActivityPub under the hood to let their different instances exchange posts, events, and updates. These are only a few examples: many other software projects use ActivityPub to rebuild social media in less exploitative ways, or to invent entirely new forms of online social interaction.
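As a rough illustration of what instances exchange, here is a sketch of an event expressed with the ActivityStreams 2.0 vocabulary that ActivityPub builds on. The `Event`, `Place` and `Create` types and the `name`, `startTime`, `location`, `actor` and `object` properties come from that vocabulary; the actor URL, the date and the helper function names are made up for this example. Real servers such as Mobilizon add their own extensions, and actual delivery to another instance involves signed HTTP requests that are not shown here.

```python
import json

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def to_event_object(title: str, start: str, place: str) -> dict:
    """Describe an event with plain ActivityStreams 2.0 vocabulary."""
    return {
        "type": "Event",
        "name": title,
        "startTime": start,  # date-time with an explicit UTC offset
        "location": {"type": "Place", "name": place},
    }


def wrap_in_create(actor: str, obj: dict) -> dict:
    """Wrap an object in the Create activity that announces it."""
    return {
        "@context": AS_CONTEXT,
        "type": "Create",
        "actor": actor,
        "object": obj,
    }


event = to_event_object(
    "Cheshire-cat-parade",
    "2021-06-12T15:00:00+02:00",  # invented date, for illustration only
    "Wonderland Platz",
)
# Hypothetical actor on a hypothetical instance.
activity = wrap_in_create("https://events.example/@wonderland", event)
print(json.dumps(activity, indent=2))
```

Because the format is shared, any software that understands these types can display the same event, whatever instance it was published on.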

All these services are made of many independent servers, called instances. Together, these instances form the distributed and decentralized Fediverse: no single server owns, sees, or controls the whole network. This nature of the Fediverse gives us the control and agency we need, but makes it harder to communicate to a wider audience. As explained below, if anyone tries to convince you that we can make do without decentralization and distribution, that there is a BetterFacebook, aka a single product which is just better, you’re being fooled.

This poster was one of our ways to summon hackers and activists to our adversarial interoperability hackathon; the website is now archived at librevents.vecna.eu, and you can find a few talks meant to capture the key reasoning behind that moment.¹⁷

But the tension remained: while many clubs wanted out of Facebook’s exploitative system, they still depended on its critical mass, and the users were there. The reach was too hard to walk away from.

2.2Agosti_image3.png — Fig 2: Poster for the first Librevents hackathon in Berlin, organised by Mobilizon Berlin and Tracking Exposed to prototype federated alternatives to Facebook Events.

Let’s get back to Berlin: the first design of Librevents set out to liberate Facebook event data and inject it directly into Mobilizon, a federated event platform. We spun up a dedicated platform (mobilize.berlin)¹⁸ and started pulling in listings from Berlin clubs, protests and community events. Our goal: show people they didn’t have to live in Facebook’s chokehold to know what was happening in their own city.

We developed a guerrilla toolkit to make it work. One part was an automated browser that could log into Facebook, navigate events, and extract the data. The other was a browser extension volunteers could use to scrape events manually as they browsed. It was messy, but it worked. We mirrored Facebook’s public event ecosystem into the Fediverse, transforming an otherwise empty space into a live calendar of real, relevant happenings.

The response was encouraging. There were waves of new users every time Musk or Zuckerberg did something sketchy. Organizers saw original and mirrored events on Mobilizon and started using it.

If a dog could be as big as an elephant, it would look like an elephant.

Platforms in the Fediverse are designed with collective values at their core, for people, not profit. However, in a world dominated by platforms with vast user bases, they struggle under the weight of network effects and only attract small, self-selecting communities; a tiny minority who care about the technology. This is the first value-conflict: is that OK? Perhaps yes!

The small communities in the Fediverse are happy this way; they aren’t looking to scale up at any cost. Maybe that’s wise; everything Facebook-sized becomes as corrupt as Facebook.

Or is this time different? Is interoperability enough to stop the usual corruption that arrives when a platform becomes central? Probably not. As soon as a network concentrates attention, it attracts influencers, advertisers and data hoarders; whoever runs key pieces of infrastructure gains power, and attacks follow. Let’s call them the attention parasites. They are negative effects roaming into a network once it starts to gain social relevance.

The Fediverse has one real line of defense: diversity of software, instances and governance. A protocol can still be attacked or co-opted, but an ecosystem of independent instances and multiple different logics is harder to capture, making the efforts of attention parasites less profitable and less likely to succeed. This is why I do not see Bluesky as a convincing alternative. Despite its decentralized branding, it reproduces Twitter-like social mechanics and still depends on central indexes. Backed by venture capital and operated by one company, it centralizes power and responsibility instead of distributing them. It’s not just a matter of semantics: I can see a diverse ecosystem growing there, but the question remains: when the aforementioned attention parasites show up, will the solution be a multitude of answers taken by the community, or a carefully decided trade-off by the tech leadership? The second option is a sign of centralization, and that is the vulnerability.

Let’s welcome Adversarial Interoperability

In The Internet Con: How to Seize the Means of Computation,¹⁹ Cory Doctorow discusses ‘competitive compatibility’, or ‘com-com’. It’s the idea that users and developers should be free to make new things that interoperate with old ones, even if incumbents dislike it, because that’s how the open internet originally worked. He connects this to a larger political argument: that the tech giants’ dominance was built on com-com (e.g., early web browsers, email clients, and social tools), but once entrenched, they lobbied to criminalize the very practices that allowed them to rise.

And that’s why we now use the term ‘adversarial interoperability’ for when someone makes a new tool or service that plugs into an existing one without permission. Librevents, a practical experiment in this concept, proved that adversarial interoperability can bridge the centralized and the decentralized: like when your friend saves the paywalled article as a PDF and shares it over a group chat. The friend is doing a service to the group, but it’s a manual process of selection and sharing. For successful Data Liberation, we have to reduce the number of clicks necessary to do it. With just enough code, creativity, and a refusal to play by the rules, we can operationalize this.

That was just an experiment to test a hypothesis, but it also opened the floodgates to a tangled mess of ethical dilemmas, technical nightmares, and community-building hopes we can’t just duct-tape over, even if the mission is as noble as dropkicking Elon Zuckergoogle’s face for the greater good.

First Lessons

The meeting at the hackathon mentioned earlier provided us with some practical feedback:

We need to make sure that clubs don’t feel that their invitation is being blindly copied. It’s the equivalent of a brand replication attack. (Trust issue?)
If someone asks a question, we need to ensure that someone else answers it, or at least informs people where they can find effective information. (Community support, transparency issue?)
We need to ensure that the link to the club is available correctly, so that even if we collect it from Facebook, people accessing it via Mobilizon can reach it directly. (Technical issue?)

Based on this feedback, you’ll see a list of things that didn’t work out below. I’m listing them like an explorer who has returned from a difficult, harsh expedition, so that if new expeditioners decide to take the same trip, they will be better informed.

The medium (or, the interface?) is the message

If the user interface designer includes a question such as ‘Is this event wheelchair accessible?’ next to the ‘Location’ field every time a new event is created, this accessibility detail will be attached to your invitation. If you make this question mandatory, 100% of events will have this metadata.

Facebook does not ask about wheelchair access, nor does it care about many other accessibility details or edge cases. It is optimized for the least friction, for ‘default thinking’: the setting for ‘normies’. Nobody should start a Data Liberation without mapping the missing metadata; otherwise, you’ll get close to…

‘Garbage in, garbage out’²⁰

Centralized social media platforms are overloaded with clickbait, native advertising and disinformation, among other things. Data Liberators should avoid copying such material unthinkingly. Uncritical automation risks degrading the quality of the federated network. If we replicate their noise, are we any different?

Returning to the wheelchair example above, there are three options:

~~You can just copy the event without caring about it~~. Nope. As I said, you shouldn’t repost garbage just because it’s easy. Check the source thoroughly before trusting their content with an automatic republication.
Manually curate the liberated data. This involves less automation and more individual selection. Check the place yourself and expand the description.
LLM Deepsearch. At the cost of some time, a modern, non-optimized computer might be able to run local models well enough to execute simple data enrichment functions. With the spread of LLMs, whether run locally or as a cheap service, this might be tempting, but it introduces bias and unpredictable behavior, so it should always be integrated with a human-in-the-loop.

You should also expect heterogeneous behaviors. Not all events have the same details. Some organizers write this information in the attached picture, others in the description or even behind an external link. This unpredictability is the first challenge. Where possible, this is solved with technology, but adding manual review steps and the ability to edit the event details was deemed the safest option. We knew that an extra step would incur a cost in terms of usability. Fewer events will be liberated, but this guarantees quality and a hand-picked decision.
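The manual review step can be pictured as a simple triage: events with all the metadata we care about can move on, while the others wait for a human. The sketch below is an assumption about how such a gate might look, not a description of the Librevents code; the field names, including `wheelchair_accessible`, are hypothetical.

```python
# Fields a liberated event must carry before it is republished.
# The list is a hypothetical choice for this example.
REQUIRED_FIELDS = ("title", "start", "location", "wheelchair_accessible")


def missing_fields(event: dict) -> list[str]:
    """List required fields that are absent or empty.

    A value of False counts as an answer: 'not accessible' is still
    information, while None or an empty string means nobody checked.
    """
    return [f for f in REQUIRED_FIELDS if event.get(f) in (None, "")]


def triage(events):
    """Split events into those ready to publish and those needing a human."""
    ready, needs_review = [], []
    for event in events:
        gaps = missing_fields(event)
        if gaps:
            needs_review.append((event, gaps))
        else:
            ready.append(event)
    return ready, needs_review


scraped = [
    {"title": "Cheshire-cat-parade", "start": "2021-06-12T15:00",
     "location": "Wonderland Platz", "wheelchair_accessible": False},
    {"title": "Tea party", "start": "2021-06-13T18:00",
     "location": "Wonderland Platz"},
]
ready, needs_review = triage(scraped)
for event, gaps in needs_review:
    print(f"{event['title']}: ask a human about {', '.join(gaps)}")
```

Whether the gaps are then filled by a volunteer, by the organizer, or by a model whose output a person confirms is a separate decision; the gate only makes sure nothing incomplete is republished automatically.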

Technical and Operational Challenges

Any Big Tech platform performs frequent UI updates and especially invests in anti-scraping tactics. It’s a cat and mouse game you should be ready to handle. But in addition to those, the feedback received taught us:

Consent dilemma: some organizers were happy to find their events on Mobilizon, but others were confused or uneasy (‘Who put my event here? Is this official?’). This required cross-platform outreach and a dose of community management.
Conversation fragmentation: Facebook and Mobilizon both allow event comments, but they aren’t synced. Our solution was to clearly mark imported events and link back to the original. It wasn’t elegant, but it reduced the risk of inaccuracy.

We made sure the Mobilizon posts credited the original Facebook page or user who created the event, to the extent possible. This was important because we didn’t want to steal or misrepresent anything. A standardized way to attribute cross-posted content (like a federated equivalent of a retweet that clearly points back) would be very useful for such projects, and a protocol I am planning to look into is SOAP: A Social Authentication Protocol²¹ which might solve the issue of linking and authenticating content across social networks.
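In the absence of such a standard, one possible convention for marking an imported item is sketched below. It is an assumption made for illustration, not the format Librevents used and not an agreed Fediverse practice: the function name, the wording of the notice and the example URLs are all hypothetical. The `content` and `url` properties and the `Link` type with `href` and `rel` do exist in the ActivityStreams 2.0 vocabulary.

```python
def mark_as_imported(obj: dict, source_url: str, source_name: str) -> dict:
    """Return a copy of an ActivityStreams object with visible provenance.

    The notice is placed in the human-readable content so that every
    client shows it, and the original address is kept as a Link so that
    software can follow it back.
    """
    marked = dict(obj)  # shallow copy: the input is left untouched
    notice = f"Imported from {source_name}. Original: {source_url}"
    body = marked.get("content", "")
    marked["content"] = f"{notice}\n\n{body}" if body else notice
    marked["url"] = {"type": "Link", "href": source_url, "rel": "canonical"}
    return marked


original = {"type": "Event", "name": "Cheshire-cat-parade",
            "content": "Starting from 15:00 in Wonderland Platz"}
mirrored = mark_as_imported(
    original,
    "https://bigplatform.example/events/123",  # hypothetical source address
    "the organizer's page on BigPlatform",     # hypothetical source label
)
print(mirrored["content"])
```

A shared, authenticated way of expressing the same thing would be preferable; the sketch only shows how little is needed to avoid misrepresenting where content came from.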

What makes data worthy of liberation?

It’s subjective; it depends on the barriers and/or surveillance costs imposed. For some people, a cookie banner that forces you to accept advertisements is a form of extortion. For others, it’s a business interest, not a public one. The following four points suggest some general rules:

Public interest value: Content with civic, cultural, or informational significance (e.g., events, alerts, publicly funded material), not trivial or purely private updates.
Trapped in exploitative systems: Data locked inside platforms that rely on surveillance, profiling, or coercive consent models.
Respect for creators: Liberation must include attribution, the possibility for the original creator to reclaim control, and exclusion of content where republishing could cause harm.
Extra: If you are the creator, you don’t need to liberate your own data. In that case you should ensure data portability to allow for multi-platform publishing (the ‘Related existing experiences’ section below talks about this).

If This Works, Who Stays in Control?

Data Liberation is a temporary bridge. The endgame is simple: a critical mass of people and creators migrate to the Fediverse, so we don’t have to liberate their content anymore. But to reach that point, copying isn’t enough: we need to support creators in reclaiming their agency.

That means helping them migrate both old and new accounts. Any liberated content should, by design, end up under the control of its original author. To do this right, every user, page or channel we mirror should have a one-to-one mapping in the Fediverse, anchored in a decentralized index²² that does not yet exist. There also needs to be a clear, accessible process to hand over control of these mirrored accounts and their archives when the creator is ready. And then there is the awkward question of sustainability: who funds this work? Running scrapers and federated infrastructure at scale costs time, servers, maintenance and legal resilience; there are only so many unpaid hours we can squeeze out of volunteers. If Data Liberation is to move beyond fragile experiments, it will need support from institutions and foundations willing to weaken social media monopolies rather than partner with them, and to treat this as infrastructure rather than a one-off campaign. For now, some of the technical challenges and the long-term funding model remain unsolved.

Dream big

We’re not stopping at events. The next move is to liberate everything that keeps people shackled to social media platforms: videos, pictures, articles. And here’s our advantage: the Fediverse isn’t just one platform, it’s an ecosystem. What if the next viral comedian never needs YouTube?

ActivityPub gives us that power! Photos flow through PixelFed²³, videos through PeerTube²⁴, longform through WriteFreely²⁵, and many others: one protocol, many front doors.

We don’t need to flood the system. Imagine one campaign at a time. It could be to liberate peer-reviewed science from closed portals (I would call this ‘the Aaron’, ofc)²⁶ or it could focus on public interest announcements. Or the latest meme collection. The content type isn’t important; it’s about the communities. Reach them. Show them that freedom from a profiling algorithm is possible. Curate together. Build something real.

Not everyone would be willing to fully leave the surveillance-based social media platforms. And when creators are not migrating or mirroring their channels, Data Liberation steps in. In the medium term, this is a bridge; in the long term, we need to ensure new content will be produced natively on free networks.

Federation Alone Won’t Save Us

Is it really a victory if the same systemic abuses are repeated on a different protocol? If the majority of people are still manipulated by powerful techno-billionaires, does it matter which app they use?

Switching platforms isn’t a solution if users don’t understand the politics of each Fediverse instance, the dynamics of exploitation, or how their attention is being harvested, because being federated does not automatically guarantee the good faith of the instance administrator.

This isn’t just a technical issue; it is political and educational. No protocol alone can fix the fact that, even if the switching cost to leave surveillance-based social media platforms were zero, most people would still experience it as merely changing service providers. For whole generations these platforms are where they become ‘informed’ and keep in touch, so the harm feels intangible, spread across interfaces and defaults. Our human stories aren’t equipped to recognize these struggles. It’s not just a private choice about which app to use; it’s a political choice that impacts society as a whole.

Let’s play a nightmare scenario. Influencers or corporations begin running their own instances in the Fediverse, harvesting user data and attention under the guise of decentralization. Instead of one massive platform, we’d face many smaller centers of control, still profiting from surveillance and manipulation. If a business-oriented instance attracts enough users, keeps them connected to the broader Fediverse, and quietly feeds their activity into a Real-Time Bidding²⁷ market, then nothing fundamental has changed. We’ve simply decentralized the exploitation. And let’s not forget: an open-by-default network makes large-scale profiling easier for adversaries who wish to study and target individual behavior. This creates an expectation of high reliability – an expectation born out of top-notch infrastructure engineering – now applied to a system built on voluntary efforts.

The landscape has already begun to shift. After 2022, the Fediverse saw a significant surge in users, and new client/server implementations began to flourish. But this growth carries a looming threat: can a federated network be conquered—and if so, how?

As the aforementioned Cory Doctorow book points out:

The existence of for-profit servers in the Fediverse does not ruin the Fediverse (though I wouldn’t personally use one of them). The fact that multiple neo-Nazi groups run their own Mastodon servers does not ruin the Fediverse (though I certainly won’t use their servers). Not even the fact that Donald Trump’s Truth Social is a Mastodon server does anything to ruin the Fediverse (not using that one, either).²⁸

You should critically examine to what extent any supposed replacement for a social network would replicate the same social and technical dynamics. If the aim is engagement and virality rather than healthy discussion within a selected (or even open) community, then the protocol doesn’t matter: you already know how it goes.

Such complex dynamics call for different solutions, each with its own limits and perks. To learn more, take a look at openportability.org,²⁹ which helps users migrate from Twitter to equivalent networks, and at POSSE³⁰ (Publish on your Own Site, Syndicate Elsewhere). To follow the regulatory struggle, look at the Data Transfer Initiative.³¹ There is also a tutorial on how to sync your YouTube channel to PeerTube,³² and the multi-purpose, multi-channel, multi-network Bridgy Fed.³³

One of the most notable tools in this space is Nitter,³⁴ a free alternative front-end for Twitter/X. It allows anyone to view public tweets without logging in, tracking, or ads. In 2024, Twitter/X blocked anonymous guest access,³⁵ citing unauthorized scraping and API violations, and many public Nitter instances stopped working.³⁶ While Nitter achieves meaningful privacy and access goals, it doesn’t liberate data in the sense that this article means; it only mirrors it. The result is a read-only website that doesn’t free the content toward autonomous, federated spaces. Still, it’s a worthy example of an adversarial effort to reduce surveillance: with it, people can read tweets without exposing themselves to X’s trackers. Nitter restores access to public information without an X account, bypassing X’s dark-pattern ‘log in to view more’ walls. It is also a good example because it is designed to empower self-hosting, since anyone can deploy their own front-end.

When Liberation Ends: Reflections on Sustainability and Continuity

The active phase of Librevents concluded at the beginning of 2023: this article gathers its lessons, and the libr.events domain now survives only as an archive.³⁷ Librevents was never meant to be a permanent service, but a proof of concept that a small, independent initiative can chip away at network effects by pulling content out of walled gardens.

Data Liberation excites but also worries me. We mustn’t give the impression that we are scammers cloning profiles. It is vital to respect the rights and efforts of creators who share knowledge as a commons: checking facts, providing sources, inviting critique. Their work is often exploited and underpaid, yet their knowledge ecosystems must be preserved, and any mirrored archive should ultimately return under their control. By contrast, streams of content optimized for hype, advertising and blunt propaganda do not deserve the same protection; Data Liberation projects should avoid reproducing that layer of the attention economy. Data liberators should not see themselves as massive scrapers and re-posters, but as archivists capable of deciding, curating, indexing, and filtering content. If possible, they should enrich the content with new metadata.

Data Liberation is a means, not an end. The long-term goal is a digital environment in which users are no longer confined to walled gardens, where interoperability is standard and monopolistic lock-in fades into the past. But every adversarial project is asymmetrical: on one side, thousands of well-paid, coordinated engineers and lobbyists reinforcing lock-in; on the other, a handful of hackers trying to subvert technology and create an opening.

Librevents relied on short bursts of volunteer energy and never grew into a stable contributor base, perhaps because it arrived too early. As the conflict over the digital sphere hardens, the next iterations of Data Liberation will need both better tools and stronger collective backing. At best, they can help a more aware society reclaim some power over its communication infrastructure.

From Brussels Effect to Washington Effect

Looking forward, the sustainability of Data Liberation cannot rest solely on activists. It requires distributed collaboration projects sharing code, federated infrastructure providing redundancy, and communities of practice that learn from each other’s partial failures.

If I still believed that robust, rights-based regulation from Brussels and Washington was politically within reach, I would probably present it here as the obvious solution. But we can no longer afford that illusion. The European Commission that once promised to rein in platform power is now busy hollowing out its own rulebook: a weak AI Act under constant pressure from corporate lobbyists to delay and simplify its obligations;³⁸ a Chat Control mandate that normalizes indiscriminate mass scanning and the end of anonymous communication;³⁹ Google’s new developer registration decree, showing how a de facto gatekeeper can still unilaterally rewrite the terms of app distribution and threaten projects like F-Droid (wasn’t the Digital Markets Act designed to prevent such a dominant position?);⁴⁰ even worse, the recent Digital Omnibus that openly rolls back core EU digital protections in the name of deregulation and competitiveness;⁴¹ and, most absurdly, a sovereignty debate in which opening more Big Tech data centers in Europe is sold as strategic autonomy while entrenching extractive control over knowledge, energy, and resources.⁴² Rather than a ‘Brussels effect’, what we are now witnessing looks more like a new Washington effect: under the banner of ‘innovation’ and ‘competitiveness’, the current Commission aligns itself with a US-centered security and industrial agenda, steadily watering down the very regulatory backbone that once made EU standards exportable. The Brussels effect was never a natural law; it depended on political courage and a willingness to confront corporate power, and both are in short supply. If there is any realistic source of hope, it may lie less in today’s regulators than in younger generations, who already act as global trend-setters: if it became cool not to live inside centralized platforms, their refusal could do more to shift the landscape than yet another round of timid European white papers.

This is why we cannot wait for benevolent regulators to deliver interoperability from above. The struggle is political: at the European level, to push whatever regulation does emerge to strengthen rather than sacrifice digital rights, and at the local level, to ensure that federated instances are governed by and accountable to the people who inhabit them and the communities they enable.

If Librevents has shown anything, it is that the battle over data is not only about code or law, but about imagination. To imagine a world where networks do not hold us hostage, where creators are not trapped in extractive systems, where interoperability is not adversarial but ordinary. Data Liberation is one tool in that struggle: a small hammer against a very high wall. But every strike leaves a mark.

(1) The Social Dilemma (dir. Jeff Orlowski, 2020), documentary film, Netflix, available at: http://www.netflix.com/title/81254224. ↩
(2) Cory Doctorow, ‘Adversarial Interoperability: Reviving an Elegant Weapon From a More Civilized Age to Slay Today’s Monopolies’, Deeplinks blog (Electronic Frontier Foundation), 7 June 2019, https://www.eff.org/deeplinks/2019/06/adversarial-interoperability-reviving-elegant-weapon-more-civilized-age-slay. ↩
(3) ‘What is Web Scraping? – Definition from Techopedia’, Techopedia, https://www.techopedia.com/definition/5212/web-scraping. ↩
(4) David Golumbia, ‘We Don’t Know What “Personal Data” Means’, Uncomputing (blog), 20 June 2018, https://web.archive.org/web/20181130023848/http://www.uncomputing.org/?p=1983. ↩
(5) Digital Freedom Fund, ‘Non-Consensual Tracking on Pornhub’, case study, https://digitalfreedomfund.org/non-consensual-tracking-on-pornhub. ↩
(6) Tracking Exposed, https://tracking.exposed – inception project exposing platform tracking, spawning this and related experiments. ↩
(7) AI Forensics, https://aiforensics.org – nonprofit investigating influential and opaque algorithms. ↩
(8) Reversing Works, https://reversing.works – reverse engineering holding gig platforms accountable. ↩
(9) Nothing to Hide (dir. Marc Meillassoux and Mihaela Gladovic, 2017), documentary film, 86 min, Deep Docs Films, available at: https://deepdocs.eu/nth/. ↩
(10) France 24, ‘Under the Radar: Covering Your Online Tracks’, Reporters, television report, 15 April 2022, video, https://www.youtube.com/watch?v=YKQQM1KNXMU. ↩
(11) Joseph, ‘Speech about Community, Free Software, Decentralization, and mobilize.berlin’, The Digital Self (blog), 20 June 2021, https://blogs.fsfe.org/joseph/2021/06/20/speech-about-community-free-software-decentralization-and-mobilize-berlin/. ↩
(12) Wikipedia contributors, ‘Fediverse’, 27 November 2025, https://en.wikipedia.org/?oldid=1324475159, accessed 28 November 2025. ↩
(13) ActivityPub.rocks, https://activitypub.rocks – tutorials and test suite for interoperable ActivityPub implementations. ↩
(14) Mastodon, https://joinmastodon.org – main onboarding portal to Mastodon, flagship ActivityPub microblogging network. ↩
(15) Mobilizon, https://mobilizon.org – federated events and groups platform, ethical alternative to Facebook Events. ↩
(16) Gancio, https://gancio.org – federated community events calendar software using ActivityPub. ↩
(17) Librevents/Mobilize.Berlin, ‘Librevents “Liberating” Data From Big Tech’, Hacking in Parallel (///HiP-Berlin///) conference, Berlin, 28 December 2022, video, https://diode.zone/w/wy8nDtAQy7HRRtjdeJrLP7. ↩
(18) Mobilize Berlin, https://mobilize.berlin – Berlin events platform on Mobilizon, supporting Facebook migration for event organizing. ↩
(19) Cory Doctorow, The Internet Con: How to Seize the Means of Computation, London and New York: Verso, 2023. ↩
(20) ‘The concept that flawed, biased or poor quality (“garbage”) information or input produces a result or output of similar (“garbage”) quality’ from Wikipedia contributors, ‘Garbage in, Garbage out’, https://en.wikipedia.org/?oldid=1324425866, accessed 20 December 2025. ↩
(21) Felix Linker and David Basin, ‘SOAP: A Social Authentication Protocol’, arXiv preprint arXiv:2402.03199 [cs.CR], 5 February 2024, https://arxiv.org/abs/2402.03199. ↩
(22) Imagine there are two data liberators: one mirroring the New York Times YouTube channel and the other fact-checking all news about big pharma. One day, a video interview about insulin gets published on the NYT channel. Are they going to duplicate their liberation effort? That would be problematic in terms of resource optimization, and especially if the original author wants to take control of their content when migrating to the Fediverse. That’s why an index is necessary to allow the liberators to check if a resource (a URL) has already been liberated. And since we can’t trust a centralized index, a decentralized one is needed. (This simple footnote does not cover the issue of the reliability and trustworthiness of such an index.) ↩
(23) Pixelfed, https://pixelfed.org – decentralized Instagram-style photo sharing; ActivityPub federation. ↩
(24) PeerTube, https://joinpeertube.org – federated, peer-to-peer video hosting; YouTube alternative. ↩
(25) WriteFreely, https://writefreely.org – minimalist federated blogging platform, ActivityPub-compatible writing. ↩
(26) The Internet’s Own Boy: The Story of Aaron Swartz (dir. Brian Knappenberger, 2014), documentary film, 105 min, available at: https://www.youtube.com/watch?v=M85UvH0TRPc. ↩
(27) Irish Council for Civil Liberties, ‘Evidences and explanation on how RTB works’, 17 August 2020, https://www.iccl.ie/digital-data/real-time-bidding-evidence/. ↩
(28) Doctorow, The Internet Con: How to Seize the Means of Computation. ↩
(29) OpenPortability, https://openportability.org – moves followers from X to Mastodon/Bluesky, exercising cross-network data portability. ↩
(30) IndieWeb, ‘POSSE’, https://indieweb.org/POSSE – publish on your own site, syndicate copies elsewhere for user-controlled portability. ↩
(31) Data Transfer Initiative, https://dtinit.org – nonprofit developing standards and tools for direct, interoperable service-to-service data transfers. ↩
(32) Frank Ring, ‘How to Sync Your YouTube Videos to PeerTube’, Fraxoweb blog, 12 August 2024, https://fraxoweb.com/how-to-sync-your-youtube-videos-to-peertube/ – tutorial for importing YouTube channels into PeerTube, enabling video content portability out of YouTube’s silo. ↩
(33) Bridgy Fed, https://fed.brid.gy – bridge connecting websites, the fediverse, and Bluesky, syncing profiles and interactions across protocols. ↩
(34) zedeus, ‘Nitter’, GitHub repository, https://github.com/zedeus/nitter. ↩
(35) Jon Brodkin, ‘Twitter Front-end Nitter Dies as Musk Wins War Against Third-party Services’, Ars Technica, 15 February 2024, https://arstechnica.com/tech-policy/2024/02/twitter-front-end-nitter-dies-as-musk-wins-war-against-third-party-services/. ↩
(36) Throwaway59378, ‘Why Is Nitter Being Discontinued Now, but Wasn’t Before?’, GitHub issue #1175, zedeus/nitter, 16 February 2024, https://github.com/zedeus/nitter/issues/1175. ↩
(37) Mirror of the Librevents website, discontinued: https://librevents.vecna.eu. ↩
(38) Cynthia Kroet, ‘Europe’s Top CEOs Call for Commission to Slow Down on AI Act’, Euronews, 3 July 2025, https://euronews.com/next/2025/07/03/europes-top-ceos-call-for-commission-to-slow-down-on-ai-act. ↩
(39) Patrick Breyer, ‘Reality Check: EU Council Chat Control Vote Is Not a Retreat, But a Green Light for Indiscriminate Mass Surveillance and the End of Right to Communicate Anonymously’, https://www.patrick-breyer.de/en/reality-check-eu-council-chat-control-vote-is-not-a-retreat-but-a-green-light-for-indiscriminate-mass-surveillance-and-the-end-of-right-to-communicate-anonymously/. ↩
(40) marcprux, ‘F-Droid and Google’s Developer Registration Decree’, F-Droid blog, 29 September 2025, article on Google’s new worldwide developer registration requirement, https://f-droid.org/en/2025/09/29/google-developer-registration-decree.html. ↩
(41) European Digital Rights (EDRi), ‘Press Release: Commission’s Digital Omnibus Is a Major Rollback of EU Digital Protections’, 19 November 2025, https://edri.org/our-work/commissions-digital-omnibus-is-a-major-rollback-of-eu-digital-protections/. ↩
(42) Frederike Kaltheuner, ‘Europe’s Sovereignty Debate Has a Blind Spot: The AI Bubble’, LinkedIn post, 25 November 2025, https://www.linkedin.com/posts/frederike-kaltheuner_europes-sovereignty-debate-has-a-blind-spot-activity-7398993891506561024-h5dK. ↩